diff --git a/content/posts/TPU-deep-dive/index.en.md b/content/posts/TPU-deep-dive/index.en.md
index e260208..b4eca42 100644
--- a/content/posts/TPU-deep-dive/index.en.md
+++ b/content/posts/TPU-deep-dive/index.en.md
@@ -64,7 +64,12 @@ As mentioned earlier, TPU was designed specifically for AI operations. The bigge
TPU uses a special unit called Systolic array, which cannot be found in general processors (CPUs), to efficiently execute this matrix multiplication. The term "Systolic" is derived from "systole," the contraction phase of the heart. Just as the heart rhythmically beats and sends blood to various parts of the body, data moves rhythmically and regularly between computational units within the array structure, performing operations - hence the name. Systolic array optimizes data flow and maximizes parallel processing, making it efficient for large-scale operations like matrix multiplication. The process of Systolic array performing matrix multiplication can be visualized as an animation below.
-
+{{< rawhtml >}}
+
+{{< /rawhtml >}}
Next, to explain the effectiveness of Systolic array in more detail, let's compare the operation method of general processors with TPU's systolic array operation method.
diff --git a/content/posts/TPU-deep-dive/index.ko.md b/content/posts/TPU-deep-dive/index.ko.md
index c4dd6ae..5d39d02 100644
--- a/content/posts/TPU-deep-dive/index.ko.md
+++ b/content/posts/TPU-deep-dive/index.ko.md
@@ -61,7 +61,13 @@ TPU 구조를 이해하기 위해서는 먼저 TPU가 개발된 배경에 대해
TPU에서는 이 행렬 곱셈을 효율적으로 실행할 수 있도록 일반적인 프로세서(CPU)에서는 볼 수 없는 Systolic array라는 특별한 유닛을 사용합니다. "Systolic"은 심장의 수축 운동인 '수축기(systole)'에서 유래한 단어입니다. 마치 심장이 규칙적으로 박동하며 혈액을 신체의 각 부분으로 보내는 것처럼, 배열 구조 내에서 데이터가 연산 유닛 사이를 리듬감 있고 규칙적으로 이동하며 연산이 수행되는 모습에서 착안된 이름입니다. Systolic array는 데이터 흐름을 최적화하고 병렬 처리를 극대화하여 행렬 곱셈과 같은 대규모 연산에 효율적입니다. Systolic array가 행렬 곱셈을 진행하는 과정을 애니메이션으로 나타내보면 아래와 같습니다.
-
+{{< rawhtml >}}
+
+{{< /rawhtml >}}
+
다음으로는 Systolic array의 효과를 더 구체적으로 설명하기 위해 일반적인 프로세서의 연산 방식과 TPU의 systolic array를 사용한 연산 방식을 비교해보겠습니다.

diff --git a/content/posts/TPU-deep-dive/systolic_array.gif b/content/posts/TPU-deep-dive/systolic_array.gif
deleted file mode 100644
index c7369fe..0000000
Binary files a/content/posts/TPU-deep-dive/systolic_array.gif and /dev/null differ
diff --git a/content/posts/TPU-deep-dive/systolic_array.mp4 b/content/posts/TPU-deep-dive/systolic_array.mp4
new file mode 100644
index 0000000..9da46cb
Binary files /dev/null and b/content/posts/TPU-deep-dive/systolic_array.mp4 differ
diff --git a/content/posts/how-GPU-works/images/01-gpu-hopper.png b/content/posts/how-GPU-works/images/01-gpu-hopper.png
index cb3a83d..87cf509 100644
Binary files a/content/posts/how-GPU-works/images/01-gpu-hopper.png and b/content/posts/how-GPU-works/images/01-gpu-hopper.png differ
diff --git a/content/posts/how-GPU-works/images/02-rise-of-nvidia.png b/content/posts/how-GPU-works/images/02-rise-of-nvidia.png
index 9ab66a0..a79f158 100644
Binary files a/content/posts/how-GPU-works/images/02-rise-of-nvidia.png and b/content/posts/how-GPU-works/images/02-rise-of-nvidia.png differ
diff --git a/content/posts/how-GPU-works/images/04-cui-gui.png b/content/posts/how-GPU-works/images/04-cui-gui.png
index 46f2f8c..e52ed4f 100644
Binary files a/content/posts/how-GPU-works/images/04-cui-gui.png and b/content/posts/how-GPU-works/images/04-cui-gui.png differ
diff --git a/content/posts/how-GPU-works/images/05-3d-games.png b/content/posts/how-GPU-works/images/05-3d-games.png
index cb24509..18e6ba9 100644
Binary files a/content/posts/how-GPU-works/images/05-3d-games.png and b/content/posts/how-GPU-works/images/05-3d-games.png differ
diff --git a/content/posts/how-GPU-works/images/06-graphics-pipeline.png b/content/posts/how-GPU-works/images/06-graphics-pipeline.png
index dd21121..574116a 100644
Binary files a/content/posts/how-GPU-works/images/06-graphics-pipeline.png and b/content/posts/how-GPU-works/images/06-graphics-pipeline.png differ
diff --git a/content/posts/how-GPU-works/images/07-fx-arch.png b/content/posts/how-GPU-works/images/07-fx-arch.png
index a18ffa6..f25f470 100644
Binary files a/content/posts/how-GPU-works/images/07-fx-arch.png and b/content/posts/how-GPU-works/images/07-fx-arch.png differ
diff --git a/content/posts/how-GPU-works/images/08-tesla-arch.png b/content/posts/how-GPU-works/images/08-tesla-arch.png
index d679c92..d739103 100644
Binary files a/content/posts/how-GPU-works/images/08-tesla-arch.png and b/content/posts/how-GPU-works/images/08-tesla-arch.png differ
diff --git a/content/posts/how-GPU-works/images/09-pre-gpgpu.png b/content/posts/how-GPU-works/images/09-pre-gpgpu.png
index 10b3844..e2ff75a 100644
Binary files a/content/posts/how-GPU-works/images/09-pre-gpgpu.png and b/content/posts/how-GPU-works/images/09-pre-gpgpu.png differ
diff --git a/content/posts/how-GPU-works/images/10-SL-CUDA.png b/content/posts/how-GPU-works/images/10-SL-CUDA.png
index 2400ad8..3eac93d 100644
Binary files a/content/posts/how-GPU-works/images/10-SL-CUDA.png and b/content/posts/how-GPU-works/images/10-SL-CUDA.png differ
diff --git a/content/posts/how-GPU-works/images/11-hopper-full.png b/content/posts/how-GPU-works/images/11-hopper-full.png
index a32f510..030bc7e 100644
Binary files a/content/posts/how-GPU-works/images/11-hopper-full.png and b/content/posts/how-GPU-works/images/11-hopper-full.png differ
diff --git a/content/posts/how-GPU-works/images/12-hopper-sm.png b/content/posts/how-GPU-works/images/12-hopper-sm.png
index 896196b..68bcb8d 100644
Binary files a/content/posts/how-GPU-works/images/12-hopper-sm.png and b/content/posts/how-GPU-works/images/12-hopper-sm.png differ
diff --git a/content/posts/how-GPU-works/images/13-cuda-pm.png b/content/posts/how-GPU-works/images/13-cuda-pm.png
index 67a54cb..3c7eb0b 100644
Binary files a/content/posts/how-GPU-works/images/13-cuda-pm.png and b/content/posts/how-GPU-works/images/13-cuda-pm.png differ
diff --git a/content/posts/how-GPU-works/images/14-thdblk-alloc.png b/content/posts/how-GPU-works/images/14-thdblk-alloc.png
index 98c8982..c93753c 100644
Binary files a/content/posts/how-GPU-works/images/14-thdblk-alloc.png and b/content/posts/how-GPU-works/images/14-thdblk-alloc.png differ
diff --git a/content/posts/how-GPU-works/images/15-code-ex.png b/content/posts/how-GPU-works/images/15-code-ex.png
index 1187ab9..4a697f1 100644
Binary files a/content/posts/how-GPU-works/images/15-code-ex.png and b/content/posts/how-GPU-works/images/15-code-ex.png differ
diff --git a/content/posts/how-GPU-works/images/16-thread-group.png b/content/posts/how-GPU-works/images/16-thread-group.png
index 252d92c..cd57774 100644
Binary files a/content/posts/how-GPU-works/images/16-thread-group.png and b/content/posts/how-GPU-works/images/16-thread-group.png differ
diff --git a/content/posts/how-GPU-works/images/17-schd-single.png b/content/posts/how-GPU-works/images/17-schd-single.png
index a6781eb..96d50cb 100644
Binary files a/content/posts/how-GPU-works/images/17-schd-single.png and b/content/posts/how-GPU-works/images/17-schd-single.png differ
diff --git a/content/posts/how-GPU-works/images/18-schd-double.png b/content/posts/how-GPU-works/images/18-schd-double.png
index 0e0d79e..6d1a4d4 100644
Binary files a/content/posts/how-GPU-works/images/18-schd-double.png and b/content/posts/how-GPU-works/images/18-schd-double.png differ
diff --git a/content/posts/how-GPU-works/images/19-schd-triple.png b/content/posts/how-GPU-works/images/19-schd-triple.png
index ff2650c..3693ea4 100644
Binary files a/content/posts/how-GPU-works/images/19-schd-triple.png and b/content/posts/how-GPU-works/images/19-schd-triple.png differ
diff --git a/content/posts/how-GPU-works/images/gpu-cuda-logo.png b/content/posts/how-GPU-works/images/gpu-cuda-logo.png
index e2cf4d9..7d05d98 100644
Binary files a/content/posts/how-GPU-works/images/gpu-cuda-logo.png and b/content/posts/how-GPU-works/images/gpu-cuda-logo.png differ
diff --git a/content/posts/lpu-deep-dive/gifs/Disaggregated-inference.gif b/content/posts/lpu-deep-dive/gifs/Disaggregated-inference.gif
index 19582c3..5261c6a 100644
Binary files a/content/posts/lpu-deep-dive/gifs/Disaggregated-inference.gif and b/content/posts/lpu-deep-dive/gifs/Disaggregated-inference.gif differ
diff --git a/content/posts/lpu-deep-dive/gifs/gpu_example.gif b/content/posts/lpu-deep-dive/gifs/gpu_example.gif
index ddf3e0f..039fbf7 100644
Binary files a/content/posts/lpu-deep-dive/gifs/gpu_example.gif and b/content/posts/lpu-deep-dive/gifs/gpu_example.gif differ
diff --git a/content/posts/lpu-deep-dive/gifs/lpu_example.gif b/content/posts/lpu-deep-dive/gifs/lpu_example.gif
index 07c95fd..8f5d7e3 100644
Binary files a/content/posts/lpu-deep-dive/gifs/lpu_example.gif and b/content/posts/lpu-deep-dive/gifs/lpu_example.gif differ
diff --git a/content/posts/lpu-deep-dive/gifs/speculative-decoding.gif b/content/posts/lpu-deep-dive/gifs/speculative-decoding.gif
deleted file mode 100644
index f4df7b8..0000000
Binary files a/content/posts/lpu-deep-dive/gifs/speculative-decoding.gif and /dev/null differ
diff --git a/content/posts/lpu-deep-dive/images/agentic_ai.png b/content/posts/lpu-deep-dive/images/agentic_ai.png
index 8413b42..3fe900b 100644
Binary files a/content/posts/lpu-deep-dive/images/agentic_ai.png and b/content/posts/lpu-deep-dive/images/agentic_ai.png differ
diff --git a/content/posts/lpu-deep-dive/images/ai_giga_factory.png b/content/posts/lpu-deep-dive/images/ai_giga_factory.png
index 826a14f..08d43af 100644
Binary files a/content/posts/lpu-deep-dive/images/ai_giga_factory.png and b/content/posts/lpu-deep-dive/images/ai_giga_factory.png differ
diff --git a/content/posts/lpu-deep-dive/images/all_reduce.png b/content/posts/lpu-deep-dive/images/all_reduce.png
index 279d2ef..5f318d6 100644
Binary files a/content/posts/lpu-deep-dive/images/all_reduce.png and b/content/posts/lpu-deep-dive/images/all_reduce.png differ
diff --git a/content/posts/lpu-deep-dive/images/auto_regressive.png b/content/posts/lpu-deep-dive/images/auto_regressive.png
index 77a12d6..303c903 100644
Binary files a/content/posts/lpu-deep-dive/images/auto_regressive.png and b/content/posts/lpu-deep-dive/images/auto_regressive.png differ
diff --git a/content/posts/lpu-deep-dive/images/dram_vs_sram.png b/content/posts/lpu-deep-dive/images/dram_vs_sram.png
index 970ffbd..b616da4 100644
Binary files a/content/posts/lpu-deep-dive/images/dram_vs_sram.png and b/content/posts/lpu-deep-dive/images/dram_vs_sram.png differ
diff --git a/content/posts/lpu-deep-dive/images/gpu_disaggregation.png b/content/posts/lpu-deep-dive/images/gpu_disaggregation.png
index 8143610..0ca4187 100644
Binary files a/content/posts/lpu-deep-dive/images/gpu_disaggregation.png and b/content/posts/lpu-deep-dive/images/gpu_disaggregation.png differ
diff --git a/content/posts/lpu-deep-dive/images/gpu_memory_hierarchy.png b/content/posts/lpu-deep-dive/images/gpu_memory_hierarchy.png
index 3fc1053..9817c34 100644
Binary files a/content/posts/lpu-deep-dive/images/gpu_memory_hierarchy.png and b/content/posts/lpu-deep-dive/images/gpu_memory_hierarchy.png differ
diff --git a/content/posts/lpu-deep-dive/images/gpu_sync.jpg b/content/posts/lpu-deep-dive/images/gpu_sync.jpg
index d19072d..6928011 100644
Binary files a/content/posts/lpu-deep-dive/images/gpu_sync.jpg and b/content/posts/lpu-deep-dive/images/gpu_sync.jpg differ
diff --git a/content/posts/lpu-deep-dive/images/groq_logo.jpg b/content/posts/lpu-deep-dive/images/groq_logo.jpg
index cc4653b..00a626b 100644
Binary files a/content/posts/lpu-deep-dive/images/groq_logo.jpg and b/content/posts/lpu-deep-dive/images/groq_logo.jpg differ
diff --git a/content/posts/lpu-deep-dive/images/remove_ctrl_logic.png b/content/posts/lpu-deep-dive/images/remove_ctrl_logic.png
index 22c2740..0d37f93 100644
Binary files a/content/posts/lpu-deep-dive/images/remove_ctrl_logic.png and b/content/posts/lpu-deep-dive/images/remove_ctrl_logic.png differ
diff --git a/content/posts/lpu-deep-dive/images/roofline_comparison.png b/content/posts/lpu-deep-dive/images/roofline_comparison.png
index ed5615b..759fb98 100644
Binary files a/content/posts/lpu-deep-dive/images/roofline_comparison.png and b/content/posts/lpu-deep-dive/images/roofline_comparison.png differ
diff --git a/content/posts/lpu-deep-dive/images/roofline_concept.jpg b/content/posts/lpu-deep-dive/images/roofline_concept.jpg
index a21e92f..6480c98 100644
Binary files a/content/posts/lpu-deep-dive/images/roofline_concept.jpg and b/content/posts/lpu-deep-dive/images/roofline_concept.jpg differ
diff --git a/content/posts/lpu-deep-dive/images/rubin_cpx_platform.png b/content/posts/lpu-deep-dive/images/rubin_cpx_platform.png
index d41fa2d..7db3794 100644
Binary files a/content/posts/lpu-deep-dive/images/rubin_cpx_platform.png and b/content/posts/lpu-deep-dive/images/rubin_cpx_platform.png differ
diff --git a/content/posts/lpu-deep-dive/images/software_defined_hardware.png b/content/posts/lpu-deep-dive/images/software_defined_hardware.png
index c57919a..1d7294d 100644
Binary files a/content/posts/lpu-deep-dive/images/software_defined_hardware.png and b/content/posts/lpu-deep-dive/images/software_defined_hardware.png differ
diff --git a/content/posts/lpu-deep-dive/images/speculative-decoding-workflow.jpg b/content/posts/lpu-deep-dive/images/speculative-decoding-workflow.jpg
new file mode 100644
index 0000000..b32e678
Binary files /dev/null and b/content/posts/lpu-deep-dive/images/speculative-decoding-workflow.jpg differ
diff --git a/content/posts/lpu-deep-dive/images/tp&pp.jpg b/content/posts/lpu-deep-dive/images/tp&pp.jpg
index b45643a..82d782f 100644
Binary files a/content/posts/lpu-deep-dive/images/tp&pp.jpg and b/content/posts/lpu-deep-dive/images/tp&pp.jpg differ
diff --git a/content/posts/lpu-deep-dive/images/training_vs_inference.png b/content/posts/lpu-deep-dive/images/training_vs_inference.png
index eaceb65..7036b09 100644
Binary files a/content/posts/lpu-deep-dive/images/training_vs_inference.png and b/content/posts/lpu-deep-dive/images/training_vs_inference.png differ
diff --git a/content/posts/lpu-deep-dive/images/warp_scheduling.png b/content/posts/lpu-deep-dive/images/warp_scheduling.png
index 9b66c95..a39e610 100644
Binary files a/content/posts/lpu-deep-dive/images/warp_scheduling.png and b/content/posts/lpu-deep-dive/images/warp_scheduling.png differ
diff --git a/content/posts/lpu-deep-dive/index.en.md b/content/posts/lpu-deep-dive/index.en.md
index c146adb..4d751d9 100644
--- a/content/posts/lpu-deep-dive/index.en.md
+++ b/content/posts/lpu-deep-dive/index.en.md
@@ -235,7 +235,7 @@ Then what tasks can LPU proceed faster with? From the perspective of LLM inferen
**Speculative Decoding**
-
+
One recent trend in LLM serving is **Speculative Decoding**. As model sizes grow and computation time becomes longer, a small and fast model (**Draft Model**) that distills or is trained to behave similarly to the existing model (**Target Model**) quickly generates the latter part of a sentence in advance, then the Target Model verifies this in parallel. Groq's LPU clusters can be used for small-sized Draft Model computation here. This is because LPU boasts overwhelming token generation speed in small-sized models. From an overall perspective, the roles of LPU/GPU clusters can be divided as follows:
diff --git a/content/posts/lpu-deep-dive/index.ko.md b/content/posts/lpu-deep-dive/index.ko.md
index 035ce90..fba704a 100644
--- a/content/posts/lpu-deep-dive/index.ko.md
+++ b/content/posts/lpu-deep-dive/index.ko.md
@@ -236,7 +236,7 @@ LPU의 정적 스케줄링(static scheduling)은 바로 딥러닝과 LLM, 그중
**Speculative Decoding**
-
+
최근 LLM 서빙의 트렌드 중 하나는 **Speculative Decoding**(추측 디코딩)입니다. 모델 사이즈가 커지면서 연산 시간이 오래 걸리다보니 기존 모델(**Target Model**)을 증류하거나 비슷한 동작을 하도록 훈련된 작고 빠른 모델(**Draft Model**)이 문장의 뒷부분을 미리 빠르게 생성하면, Target Model이 이를 병렬로 검증하는 방식입니다. 그록의 LPU 클러스터는 여기서 작은 사이즈의 Draft Model 연산에 사용될 수 있습니다. LPU는 작은 사이즈의 모델에서 압도적인 토큰 생성 속도를 자랑하기 때문입니다. 전체적인 관점에서 LPU/GPU 클러스터의 역할을 구분해보면 아래와 같습니다.