06 Transforming

원문

번역

Large language models are very good at transforming its input to a different format, such as inputting a piece of text in one language and transforming it or translating it to a different language, or helping with spelling and grammar corrections. So taking as input a piece of text that may not be fully grammatical and helping you to fix that up a bit, or even transforming formats, such as inputting HTML and outputting JSON. So there's a bunch of applications that I used to write somewhat painfully with a bunch of regular expressions that would definitely be much more simply implemented now with a large language model and a few prompts. Yeah, I use ChatGPT to proofread pretty much everything I write these days, so I'm excited to show you some more examples in the notebook now. So first we'll import OpenAI and also use the same get_completion helper function that we've been using throughout the videos. And the first thing we'll do is a translation task. So large language models are trained on a lot of text from kind of many sources, a lot of which is the internet, and this is kind of, of course, in many different languages. So this kind of imbues the model with the ability to do translation. And these models know kind of hundreds of languages to varying degrees of proficiency. And so we'll go through some examples of how to use this capability. So let's start off with something simple.

대규모 언어 모델은 입력을 다른 형식으로 변환하는 데 매우 능숙합니다. 예를 들어 한 언어로 텍스트를 입력하고 이를 변환하거나 다른 언어로 번역하거나 맞춤법 및 문법 수정을 돕습니다. 따라서 완전히 문법적이지 않을 수 있는 텍스트를 입력으로 받아 이를 약간 수정하는 데 도움을 주거나 HTML 입력 및 JSON 출력과 같은 형식을 변환할 수도 있습니다. 따라서 대규모 언어 모델과 몇 가지 프롬프트로 훨씬 더 간단하게 구현될 수 있는 많은 정규 표현식으로 다소 힘들게 작성했던 많은 애플리케이션이 있습니다. 예, 저는 요즘 작성하는 거의 모든 것을 교정하는 데 ChatGPT를 사용하므로 이제 노트북에서 더 많은 예를 보여드리게 되어 기쁩니다. 따라서 먼저 OpenAI를 가져오고 동영상 전체에서 사용했던 것과 동일한 get_completion 도우미 기능을 사용합니다. 그리고 가장 먼저 할 일은 번역 작업입니다. 따라서 대규모 언어 모델은 다양한 출처의 많은 텍스트에 대해 교육을 받습니다. 그 중 많은 부분이 인터넷이고 이것은 물론 다양한 언어로 되어 있습니다. 따라서 이러한 종류의 번역 기능을 모델에 부여합니다. 그리고 이러한 모델은 다양한 수준의 숙달도에 따라 수백 가지 언어를 알고 있습니다. 따라서 이 기능을 사용하는 방법에 대한 몇 가지 예를 살펴보겠습니다. 간단한 것부터 시작해 봅시다.

So in this first example, the prompt is translate the following English text to Spanish. "Hi, I would like to order a blender". And the response is "Hola, me gustaría ordenar una licuadora". And I'm very sorry to all of you Spanish speakers. I never learned Spanish, unfortunately, as you can definitely tell. Okay, let's try another example. So in this example, the prompt is "Tell me what language this is". And then this is in French, "Combien coûte le lampadaire". And so let's run this. And the model has identified that "This is French." The model can also do multiple translations at once. So in this example, let's say translate the following text to French and Spanish. And you know what? Let's add another an English pirate. And the text is "I want to order a basketball". So here we have French, Spanish and English pirate. So in some languages, the translation can change depending on the speaker's relationship to the listener. And you can also explain this to the language model. And so it will be able to kind of translate accordingly. So in this example, we say, "Translate the following text to Spanish in both the formal and informal forms". "Would you like to order a pillow?" And also notice here we're using a different delimiter than these backticks. It doesn't really matter as long as there's kind of a clear separation. So here we have the formal and informal. So formal is when you're speaking to someone who's maybe senior to you or you're in a professional situation. That's when you use a formal tone and then informal is when you're speaking to maybe a group of friends. I don't actually speak Spanish but my dad does and he says that this is correct. So for the next example, we're going to pretend that we're in charge of a multinational e-commerce company and so the user messages are going to be in all different languages and so users are going to be telling us about their IT issues in a wide variety of languages.

따라서 이 첫 번째 예에서 프롬프트는 다음 영어 텍스트를 스페인어로 번역하는 것입니다. "안녕하세요, 블렌더를 주문하고 싶습니다." 응답은 "Hola, me gustaría ordenar una licuadora"입니다. 그리고 스페인어 사용자 모두에게 매우 유감입니다. 아시다시피 안타깝게도 스페인어를 배운 적이 없습니다. 좋아, 다른 예를 들어보자. 따라서 이 예에서 프롬프트는 "Tell me what language this is"입니다. 그리고 이것은 프랑스어로 "Combien coûte le lampadaire"입니다. 이제 이것을 실행해 봅시다. 그리고 모델은 "이것은 프랑스인입니다."라고 식별했습니다. 모델은 한 번에 여러 번역을 수행할 수도 있습니다. 따라서 이 예에서는 다음 텍스트를 프랑스어와 스페인어로 번역한다고 가정해 보겠습니다. 그리고 그거 알아? 다른 영국 해적을 추가합시다. 그리고 텍스트는 "농구를 주문하고 싶습니다"입니다. 여기 프랑스, 스페인, 영국 해적이 있습니다. 따라서 일부 언어에서는 화자와 청자의 관계에 따라 번역이 변경될 수 있습니다. 그리고 이를 언어 모델에 설명할 수도 있습니다. 따라서 그에 따라 번역할 수 있습니다. 따라서 이 예에서는 "공식 및 비공식 형식 모두에서 다음 텍스트를 스페인어로 번역합니다"라고 말합니다. "베개를 주문하시겠습니까?" 또한 여기에서 백틱과 다른 구분 기호를 사용하고 있음을 알 수 있습니다. 명확한 구분이 있는 한 사실상 문제가 되지 않습니다. 따라서 여기에 공식 및 비공식이 있습니다. 그래서 격식을 차리는 것은 당신보다 선배이거나 당신이 전문적인 상황에 있는 사람과 이야기할 때입니다. 격식을 차린 어조를 사용하고 친구 그룹과 이야기할 때는 비공식을 사용합니다. 저는 실제로 스페인어를 할 줄 모르지만 아버지는 스페인어를 하시며 이것이 맞다고 말씀하십니다. 따라서 다음 예에서는 다국적 전자상거래 회사를 담당하고 있다고 가정하여 사용자 메시지가 모든 다른 언어로 작성되어 사용자가 자신의 IT에 대해 알려줄 것입니다. 다양한 언어로 된 문제

So we need a universal translator. So first we'll just paste in a list of user messages in a variety of different languages. And now we will loop through each of these user messages. So "for issue in user_messages". And then I'm going to copy over this slightly longer code block. And so the first thing we'll do is ask the model to tell us what language the issue is in. So here's the prompt. Then we'll print out the original message's language and the issue. And then we'll ask the model to translate it into English and Korean. So let's run this. So the original message in French. So we have a variety of languages and then the model translates them into English and then Korean. And you can kind of see here, so the model says, "This is French". So that's because the response from this prompt is going to be "This is French". You could try editing this prompt to say something like tell me what language this is, respond with only one word or don't use a sentence, that kind of thing. If you wanted this to just be one word. Or you could ask for it in a JSON format or something like that, which would probably encourage it to not use a whole sentence. And so amazing, you've just built a universal translator. And also feel free to pause the video and add kind of any other languages you want to try here. Maybe languages you speak yourself and see how the model does. So, the next thing we're going to dive into is tone transformation. Writing can vary based on an intended audience, you know, the way that I would write an email to a colleague or a professor is obviously going to be quite different to the way I text my younger brother. And so, ChatGPT can actually also help produce different tones.

그래서 범용 번역기가 필요합니다. 먼저 다양한 언어로 된 사용자 메시지 목록을 붙여넣습니다. 이제 이러한 각 사용자 메시지를 반복합니다. 따라서 "user_messages의 문제"입니다. 그런 다음 이 약간 더 긴 코드 블록을 복사하겠습니다. 그래서 가장 먼저 할 일은 모델에게 문제가 어떤 언어로 되어 있는지 알려주도록 요청하는 것입니다. 여기 프롬프트가 있습니다. 그런 다음 원본 메시지의 언어와 문제를 인쇄합니다. 그런 다음 모델에게 영어와 한국어로 번역하도록 요청합니다. 그럼 이것을 실행해 봅시다. 따라서 원래 메시지는 프랑스어입니다. 그래서 우리는 다양한 언어를 가지고 있고 모델은 그것을 영어로 번역한 다음 한국어로 번역합니다. 여기에서 볼 수 있듯이 모델이 "이것은 프랑스식입니다"라고 말합니다. 이 프롬프트의 응답이 'This is French'이기 때문입니다. 이 프롬프트를 편집하여 이 언어가 무엇인지 알려달라거나, 한 단어로만 응답하거나, 문장을 사용하지 말거나 등의 말을 할 수 있습니다. 당신이 이것을 단지 한 단어로 원했다면. 또는 전체 문장을 사용하지 않도록 권장하는 JSON 형식 등으로 요청할 수 있습니다. 정말 놀라운 것은 방금 만능 번역기를 구축했다는 것입니다. 또한 동영상을 일시중지하고 여기에서 시도하고 싶은 다른 언어를 추가할 수도 있습니다. 자신이 말하는 언어를 보고 모델이 어떻게 하는지 볼 수도 있습니다. 그래서 다음으로 살펴볼 것은 톤 변환입니다. 글은 대상에 따라 다를 수 있습니다. 동료나 교수에게 이메일을 쓰는 방식은 분명히 동생에게 문자를 보내는 방식과 상당히 다를 것입니다. 따라서 ChatGPT는 실제로 다양한 톤을 생성하는 데 도움이 될 수도 있습니다.

So, let's look at some examples. So, in this first example, the prompt is "Translate the following from slang to a business letter". "Dude, this is Joe, check out this spec on the standing lamp." So, let's execute this. And as you can see, we have a much more formal business letter with a proposal for a standing lamp specification. The next thing that we're going to do is to convert between different formats. ChatGPT is very good at translating between different formats such as JSON to HTML, you know, XML, all kinds of things. Markdown. And so, in the prompt, we'll describe both the input and the output formats. So, here is an example. So, we have this JSON that contains a list of restaurant employees with their name and email. And then in the prompt, we're going to ask the model to translate this from JSON to HTML. So, the prompt is "Translate the following Python dictionary from JSON to an HTML table with column headers and title". And then we'll get the response from the model and print it. So, here we have some HTML displaying all of the employee names and emails. And so, now let's see if we can actually view this HTML. So, we're going to use this display function from this Python library, "display (HTML(response))". And here you can see that this is a properly formatted HTML table. The next transformation task we're going to do is spell check and grammar checking. And this is a really kind of popular use for ChatGPT.

몇 가지 예를 살펴보겠습니다. 따라서 이 첫 번째 예에서 프롬프트는 '다음을 속어에서 비즈니스 서신으로 번역'입니다. "야, 나는 조야, 스탠딩 램프에서 이 사양을 확인해봐." 자, 이것을 실행해 봅시다. 보시다시피 스탠딩 램프 사양에 대한 제안이 포함된 훨씬 더 공식적인 비즈니스 레터가 있습니다. 다음으로 할 일은 서로 다른 형식 간에 변환하는 것입니다. ChatGPT는 JSON에서 HTML, XML, 모든 종류의 형식과 같은 다양한 형식 간 번역에 매우 능숙합니다. 가격 인하. 따라서 프롬프트에서 입력 및 출력 형식을 모두 설명합니다. 여기 예가 있습니다. 따라서 레스토랑 직원의 이름과 이메일 목록이 포함된 JSON이 있습니다. 그런 다음 프롬프트에서 이를 JSON에서 HTML로 변환하도록 모델에 요청합니다. 따라서 프롬프트는 '다음 Python 사전을 JSON에서 열 헤더 및 제목이 있는 HTML 테이블로 번역'입니다. 그런 다음 모델로부터 응답을 받아 인쇄합니다. 따라서 여기에 모든 직원 이름과 이메일을 표시하는 HTML이 있습니다. 이제 이 HTML을 실제로 볼 수 있는지 봅시다. 따라서 이 Python 라이브러리의 표시 기능인 'display(HTML(response))'를 사용하겠습니다. 여기에서 이것이 올바른 형식의 HTML 테이블임을 알 수 있습니다. 다음 변환 작업은 맞춤법 검사와 문법 검사입니다. 그리고 이것은 ChatGPT에서 정말 인기 있는 용도입니다.

I highly recommend doing this, I do this all the time. And it's especially useful when you're working in a non-native language. And so here are some examples of some common grammar and spelling problems and how the language model can help address these. So I'm going to paste in a list of sentences that have some grammatical or spelling errors. And then we're going to loop through each of these sentences and ask the model to proofread these. Proofread and correct. And then we'll use some delimiters. And then we will get the response and print it as usual. And so the model is able to correct all of these grammatical errors. We could use some of the techniques that we've discussed before. So we could, to improve the prompt, we could say proofread and correct the following text. And rewrite. And rewrite the whole. And rewrite it. Corrected version. If you don't find any errors, just say no errors found. Let's try this. So this way we were able to, oh, they're still using quotes here. But you can imagine you'd be able to find a way with a little bit of iterative prompt development. To kind of find a prompt that works more reliably every single time. And so now we'll do another example. It's always useful to check your text before you post it in a public forum. And so we'll go through an example of checking a review. And so here is a review about a stuffed panda. And so we're going to ask the model to proofread and correct the review. Great. So we have this corrected version. And one cool thing we can do is find the kind of differences between our original review and the model's output. So we're going to use this redlines Python package to do this.

이렇게 하는 것이 좋습니다. 저는 항상 이렇게 합니다. 모국어가 아닌 언어로 작업할 때 특히 유용합니다. 그래서 다음은 몇 가지 일반적인 문법 및 철자 문제와 언어 모델이 이러한 문제를 해결하는 데 어떻게 도움이 되는지에 대한 몇 가지 예입니다. 그래서 문법이나 맞춤법 오류가 있는 문장 목록을 붙여넣겠습니다. 그런 다음 각 문장을 반복하고 모델에게 이를 교정하도록 요청합니다. 교정하고 수정하십시오. 그런 다음 몇 가지 구분 기호를 사용합니다. 그런 다음 응답을 받고 평소와 같이 인쇄합니다. 따라서 모델은 이러한 모든 문법 오류를 수정할 수 있습니다. 이전에 논의한 몇 가지 기술을 사용할 수 있습니다. 따라서 프롬프트를 개선하기 위해 다음 텍스트를 교정하고 수정한다고 말할 수 있습니다. 그리고 다시 작성하십시오. 그리고 전체를 다시 작성하십시오. 그리고 다시 작성하십시오. 수정된 버전입니다. 오류를 찾지 못한 경우 오류가 발견되지 않았다고 말하세요. 이것을 해보자. 그래서 이런 식으로 우리는 오, 그들은 여기서 여전히 따옴표를 사용하고 있습니다. 하지만 약간의 반복적인 프롬프트 개발로 방법을 찾을 수 있을 것이라고 상상할 수 있습니다. 매번 더 안정적으로 작동하는 프롬프트를 찾기 위해. 이제 다른 예를 들어보겠습니다. 공개 포럼에 게시하기 전에 텍스트를 확인하는 것이 항상 유용합니다. 리뷰를 확인하는 예를 살펴보겠습니다. 여기 팬더 박제에 대한 리뷰가 있습니다. 그래서 우리는 모델에게 검토를 교정하고 수정하도록 요청할 것입니다. 엄청난. 그래서 우리는 이 수정된 버전을 가지고 있습니다. 그리고 우리가 할 수 있는 한 가지 멋진 일은 원래 리뷰와 모델의 출력 간의 차이점을 찾는 것입니다. 그래서 이 redlines Python 패키지를 사용하여 이를 수행할 것입니다.

And we're going to get the diff between the original text of our review and the model output and then display this. And so here you can see the diff between the original review and the model output and the kind of things that have been corrected. So, the prompt that we used was, "proofread and correct this review". But you can also make kind of more dramatic changes, changes to tone, and that kind of thing. So, let's try one more thing. So, in this prompt, we're going to ask the model to proofread and correct this same review, but also make it more compelling and ensure that it follows APA style and targets an advanced reader. And we're also going to ask for the output in markdown format. And so we're using the same text from the original review up here. So, let's execute this. And here we have an expanded APA style review of the softpanda. So, this is it for the transforming video. Next up, we have "Expanding", where we'll take a shorter prompt and kind of generate a longer, more freeform response from a language model.

그리고 리뷰의 원본 텍스트와 모델 출력 간의 차이점을 얻은 다음 이를 표시할 것입니다. 여기에서 원본 리뷰와 모델 출력 간의 차이점과 수정된 종류를 확인할 수 있습니다. 그래서 우리가 사용한 프롬프트는 '이 리뷰를 교정하고 수정하세요'였습니다. 하지만 좀 더 드라마틱한 변화, 톤의 변화 등을 만들 수도 있습니다. 그래서 한 가지 더 시도해 봅시다. 따라서 이 프롬프트에서 우리는 모델에게 이 동일한 리뷰를 교정하고 수정하도록 요청하지만 더 설득력 있게 만들고 APA 스타일을 따르고 고급 독자를 대상으로 하는지 확인합니다. 또한 출력을 마크다운 형식으로 요청합니다. 따라서 여기에 있는 원본 리뷰의 동일한 텍스트를 사용하고 있습니다. 자, 이것을 실행해 봅시다. 여기 softpanda에 대한 확장된 APA 스타일 리뷰가 있습니다. 자, 변신 영상은 여기까지입니다. 다음으로 '확장'이 있습니다. 여기서는 더 짧은 프롬프트를 사용하여 언어 모델에서 더 길고 더 자유로운 형식의 응답을 생성합니다.

페이지 트리

06 Transforming

원문

번역