DOU Flask ML Serving API Server

An AI emotion analysis assistant service: this is the emotion analysis ML server repository for DOU.

The model is written in PyTorch, so a Python-based framework seemed the most convenient way to serve it; keeping costs down by using the GCP free tier was another reason. For these reasons, the ML serving server was built separately with Flask.

📽️ DOU demo video

💻 DOU presentation materials

📋 Project overview

A Flask API server that serves a KoBERT-based emotion classification model for Korean speech-based emotion analysis.

It classifies text into 7 emotions (joy, sadness, anger, anxiety, surprise, aversion, neutral) and provides the analysis results to the DOU Backend server through a REST API.

🙋 Team

| Name | GitHub | Role |
| --- | --- | --- |
| 한상은 (Han Sang-eun) | @silvarge | Team lead / planning, all backend and ML work |
| 김가현 (Kim Ga-hyun) | @gahyunkim | Planning, frontend (Android), and design |

🛠️ Tech stack

| Category | Details |
| --- | --- |
| Framework | Flask (Python) |
| ML Framework | PyTorch |
| Pre-trained Model | KoBERT |
| Deploy & Infra | GitHub Actions, Docker, Google Cloud Platform (GCP VM) |
| Others | Transformers, NumPy, Pandas, scikit-learn |

🚀 Key features

Emotion analysis model

  • KoBERT-based Korean emotion classification: classifies text into 7 emotion categories (joy, sadness, anger, anxiety, surprise, aversion, neutral)
  • Custom training: fine-tuned on a Korean emotion dataset
  • High-performance inference: average F1-score of 92%
  • Real-time analysis: real-time emotion analysis through a REST API
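
As a rough illustration of how a 7-way classifier output becomes an emotion label with a confidence value, the sketch below applies a softmax to one logit vector and picks the highest-probability class. The label order in `EMOTIONS` and the function names are assumptions made for this example; the actual index-to-emotion mapping used by the model is not stated in this README.

```python
import math

# Assumed label order (hypothetical); the real index-to-emotion mapping is not given here.
EMOTIONS = ["joy", "sadness", "anger", "anxiety", "surprise", "aversion", "neutral"]


def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return (label, confidence) for one 7-way logit vector."""
    if len(logits) != len(EMOTIONS):
        raise ValueError(f"expected {len(EMOTIONS)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs[best]
```

Here the confidence is simply the softmax probability of the winning class, which is one common choice and not necessarily what this server reports.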

API service

  • Emotion analysis API: returns the emotion class and a confidence value for the input text
  • Model status check: reports the model loading state and serves as a server health check
  • Batch processing: supports analyzing multiple texts in one request
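
The three capabilities above could be exposed roughly as follows. This is a minimal sketch, not the repository's actual code: the route paths (`/health`, `/predict`, `/predict/batch`), the JSON field names, the `MODEL_LOADED` flag, and the stub `predict_emotion` function are all hypothetical, and the stub returns a fixed answer in place of a real KoBERT call.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical flag; a real server would set this once the model weights are loaded.
MODEL_LOADED = True


def predict_emotion(text):
    """Stand-in for the KoBERT model call; always returns a fixed (label, confidence)."""
    return "neutral", 1.0


def json_body():
    """Return the request JSON as a dict, or an empty dict if it is missing or not an object."""
    body = request.get_json(silent=True)
    return body if isinstance(body, dict) else {}


@app.route("/health", methods=["GET"])
def health():
    # Reports server liveness together with the model loading state.
    return jsonify(status="ok", model_loaded=MODEL_LOADED)


@app.route("/predict", methods=["POST"])
def predict():
    text = json_body().get("text")
    if not isinstance(text, str) or not text.strip():
        return jsonify(error="'text' must be a non-empty string"), 400
    label, confidence = predict_emotion(text)
    return jsonify(emotion=label, confidence=confidence)


@app.route("/predict/batch", methods=["POST"])
def predict_batch():
    texts = json_body().get("texts")
    if not isinstance(texts, list) or not all(isinstance(t, str) for t in texts):
        return jsonify(error="'texts' must be a list of strings"), 400
    results = []
    for t in texts:
        label, confidence = predict_emotion(t)
        results.append({"text": t, "emotion": label, "confidence": confidence})
    return jsonify(results=results)
```

With Flask's built-in test client, `app.test_client().post("/predict", json={"text": "..."})` exercises the single-text route without starting a server.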

🐛 Troubleshooting and improvements

Major resolved issues

  • While improving the Keras-based model, a Keras version update and an update to monologg/kobert caused compatibility errors
    • Errors persisted even after writing KerasTensor → Tensor conversion code, so the model was redesigned on PyTorch-based KoBERT to get a stable improvement within the limited time
    • After the redesign, classification of the emotions that previously had low accuracy (neutral, aversion, anger) improved substantially (F1-score from the 60% range to the 80% range)
  • Insufficient hyperparameter optimization
    • Previously, inadequate tuning led to uneven performance
    • Grid search was used to systematically explore the learning rate, batch size, and dropout rate
    • An F1-score of 90% or higher was achieved in every emotion category
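
The grid search over learning rate, batch size, and dropout rate mentioned above can be sketched as an exhaustive loop over every combination. The candidate values in `GRID` are hypothetical (this README does not list the grid that was actually searched), and `toy_evaluate` is only a placeholder objective so the example runs; in practice it would train the model and return a validation F1-score.

```python
from itertools import product

# Hypothetical candidate values; the grid actually searched is not listed in this README.
GRID = {
    "learning_rate": [1e-5, 3e-5, 5e-5],
    "batch_size": [16, 32],
    "dropout": [0.3, 0.55],
}


def grid_search(grid, evaluate):
    """Evaluate every combination in `grid` and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Placeholder objective (not a real metric); replace with training plus validation F1."""
    return (
        -abs(params["learning_rate"] - 5e-5)
        - abs(params["dropout"] - 0.55)
        - abs(params["batch_size"] - 32) / 1000
    )
```

Calling `grid_search(GRID, toy_evaluate)` tries all 12 combinations. Because each one implies a full training run in the real setting, the grid is usually kept small.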

🧠 Model details

KoBERT-based emotion classification model

image

Hyperparameters

  • Learning Rate: 5e-5
  • Batch Size: 32
  • Max Length: 128
  • Dropout Rate: 0.55
  • Epochs: 4
  • Optimizer: AdamW
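
One way to keep these values in a single place is a small immutable config object, sketched below. The class name `TrainConfig` and the `total_steps` helper are assumptions for illustration and are not taken from this repository; only the default values come from the list above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters from the list above, gathered into one immutable object."""
    learning_rate: float = 5e-5
    batch_size: int = 32
    max_length: int = 128
    dropout_rate: float = 0.55
    epochs: int = 4
    optimizer: str = "AdamW"

    def total_steps(self, num_examples):
        """Optimizer steps over all epochs, counting a final partial batch as one step."""
        if num_examples < 0:
            raise ValueError("num_examples must be non-negative")
        return math.ceil(num_examples / self.batch_size) * self.epochs
```

For a hypothetical training set of 1,000 examples, `TrainConfig().total_steps(1000)` gives 32 batches per epoch over 4 epochs, so 128 steps; a count like this is typically what a learning-rate scheduler needs.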

Infrastructure architecture

image

📂 Project structure

DOU_Backend/
├── .github/workflows/     # GitHub Actions CI/CD
├── .platform/nginx/       # Elastic Beanstalk configuration
├── src/
│   ├── apis/
│   │   ├── auths/         # Authentication/authorization (Kakao OAuth, JWT)
│   │   ├── sentiments/    # Emotion analysis API integration
│   │   ├── records/       # Conversation records and emotion history
│   │   ├── gpt/           # GPT response handling
│   │   └── users/         # User management
│   ├── commons/           # Common modules (exception handling, logger, Swagger)
│   └── main.ts
├── test/                  # E2E tests
└── package.json