Summary of The Prompt Report: A Systematic Survey of Prompt Engineering Techniques

2026. 6. 2. 11:28 · Collection/IT

https://arxiv.org/abs/2406.06608

8. Conclusions

Generative AI is a novel technology, and broader understanding of models’ capabilities and limitations remains limited. Natural language is a flexible, open-ended interface, with models having few obvious affordances. The use of Generative AI therefore inherits many of the standard challenges of linguistic communication—e.g., ambiguity, the role of context, the need for course correction— while at the same time adding the challenge of communicating with an entity whose “understanding” of language may not bear any substantial relationship to human understanding.
Many of the techniques described here have been called “emergent”, but it is perhaps more appropriate to say that they were discovered—the result of thorough experimentation, analogies from human reasoning, or pure serendipity.
The present work is an initial attempt to categorize the species of an unfamiliar territory. While we make every attempt to be comprehensive, there are sure to be gaps and redundancies. Our intention is to provide a taxonomy and terminology that cover a large number of existing prompt engineering techniques, and which can accommodate future methods. We discuss over 200 prompting techniques, frameworks built around them, and issues like safety and security that need to be kept in mind when using them. We also present two case studies in order to provide a clear sense of models’ capabilities and what it is like to tackle a problem in practice.
Last, our stance is primarily observational, and we make no claims to the validity of the presented techniques. The field is new, and evaluation is variable and unstandardized—even the most meticulous experimentation may suffer from unanticipated shortcomings, and model outputs themselves are sensitive to meaning-preserving changes in inputs. As a result, we encourage the reader to avoid taking any claims at face value and to recognize that techniques may not transfer to other models, problems, or datasets.
To those just beginning in prompt engineering, our recommendations resemble what one would recommend in any machine learning setting: understand the problem you are trying to solve (rather than just focusing on input/output and benchmark scores), and ensure the data and metrics you are working with constitute a good representation of that problem. It is better to start with simpler approaches first, and to remain skeptical of claims about method performance. To those already engaged in prompt engineering, we hope that our taxonomy will shed light on the relationships between existing techniques. To those developing new techniques, we encourage situating new methods within our taxonomy, as well as including ecologically valid case studies and illustrations of those techniques.

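Generative AI is still a new technology, and a broad understanding of what models can and cannot do is still lacking. Natural language is a flexible, open-ended interface, but models offer few clear affordances. Using generative AI therefore inherits the usual difficulties of linguistic communication, such as ambiguity, the role of context, and the need for course correction, and adds a new one: communicating with an entity whose "understanding" of language may have little real connection to human understanding.
Many of the techniques introduced here have been called "emergent," but "discovered" is the more fitting word, because they came out of thorough experimentation, analogies to human reasoning, or pure serendipity.
This work is an early attempt to classify the species of a still-unfamiliar territory. We have tried to be as comprehensive as possible, but there are certainly gaps and overlaps. Our aim is to provide a taxonomy and terminology that cover the many existing prompt engineering techniques and can also accommodate new methods in the future.
We cover more than 200 prompting techniques, the frameworks built on them, and the safety and security issues that must be considered when using them. We also present two case studies to show clearly what models can actually do and what solving a problem in practice looks like.
Finally, our stance is primarily observational, and we make no claims about the validity of the techniques presented. The field is still new, and evaluation methods are varied and unstandardized. Even the most careful experiment can reveal unexpected limitations, and model outputs are highly sensitive to input changes that preserve meaning. Readers should therefore not take any claim at face value, and should keep in mind that a technique may not generalize to other models, problems, or datasets.
For those just starting out in prompt engineering, the advice is much the same as in any machine learning setting: understand the problem you are trying to solve properly, rather than focusing only on inputs, outputs, or benchmark scores. In particular, check that the data and metrics you are using represent that problem well. Start with simple approaches, and stay skeptical of claims about how well a method performs.
For those already familiar with prompt engineering, we hope our taxonomy helps clarify the relationships among existing techniques. For those developing new techniques, we recommend placing the new method within this taxonomy and providing ecologically valid case studies and concrete examples alongside it.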
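
The caution above, that model outputs are sensitive to meaning-preserving changes in inputs, can be checked on your own task with a very small harness. The following is a minimal sketch, not a method from the paper: `paraphrase_agreement`, `toy_model`, and the example prompts are all hypothetical names and data introduced here for illustration, and `toy_model` is a deliberately brittle stand-in that you would replace with a call to whatever model you actually use.

```python
from collections import Counter
from typing import Callable, Sequence


def paraphrase_agreement(model: Callable[[str], str], prompts: Sequence[str]) -> float:
    """Return the fraction of prompts whose output equals the most common output.

    1.0 means every paraphrase produced the same answer; lower values mean
    the answer changed with the wording alone.
    """
    if not prompts:
        raise ValueError("need at least one prompt")
    # Normalize lightly so trivial casing/whitespace differences do not count.
    outputs = [model(p).strip().lower() for p in prompts]
    _, top_count = Counter(outputs).most_common(1)[0]
    return top_count / len(outputs)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call (assumption, not a real API).

    It keys on a surface word in the prompt, which makes it sensitive to
    rewording even when the meaning of the question is unchanged.
    """
    return "positive" if "sentiment" in prompt.lower() else "neutral"


# Four wordings of the same question about the same review text.
paraphrases = [
    "What is the sentiment of: 'I loved this film'?",
    "Classify the sentiment: 'I loved this film'",
    "How does the writer feel in: 'I loved this film'?",
    "Sentiment of this review? 'I loved this film'",
]

score = paraphrase_agreement(toy_model, paraphrases)
print(f"agreement across paraphrases: {score:.2f}")
```

With the toy stand-in, three of the four wordings produce the same answer, so the score is 0.75. Agreement is only one possible signal; it says nothing about whether the majority answer is correct, so it complements rather than replaces a task-appropriate metric.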
