33 commits
a1bd11a
my initial comment
KendalCoder Jun 9, 2025
93a765d
Update README.md
KendalCoder Jun 9, 2025
fc060a6
Restart: preparing to update README with correct instructions
KendalCoder Jun 9, 2025
76d39d8
Update README with correct instructions
KendalCoder Jun 9, 2025
eb594de
todays task
KendalCoder Jun 10, 2025
11631e3
daily task
KendalCoder Jun 10, 2025
a53969a
initial code
KendalCoder Jun 10, 2025
69ba990
initial code for main
KendalCoder Jun 10, 2025
83ad4a9
Update task_A.py
KendalCoder Jun 10, 2025
88abbf0
Update task_B.py
KendalCoder Jun 10, 2025
3f84670
Update task_C.py
KendalCoder Jun 10, 2025
eabd0c5
Update main.py
KendalCoder Jun 10, 2025
d333b10
day 2
KendalCoder Jun 11, 2025
e2c91ef
add
KendalCoder Jun 12, 2025
dad4942
day 3
KendalCoder Jun 12, 2025
38533f5
code upload
KendalCoder Jun 12, 2025
4d87bb5
new changes
KendalCoder Jun 13, 2025
82c98dc
new code
KendalCoder Jun 13, 2025
6913b82
Final changges for code 2
KendalCoder Jun 13, 2025
6bb2e43
for new code
KendalCoder Jun 13, 2025
591af42
changes end of week
KendalCoder Jun 13, 2025
cc688a5
end of week changes
KendalCoder Jun 13, 2025
2854fe8
Update Week 3
KendalCoder Jun 20, 2025
0abfb57
week 4 daily blog
Jun 27, 2025
bdfb34c
week 4 daily blog
Jun 27, 2025
56aa858
Add weekly progress logs for Week 5 and Week 6, including metrics and…
Jul 7, 2025
28fd5c3
week 6
Jul 14, 2025
4b7cf20
Fix tensorboard.py and finalize results for journal; update visual pr…
Jul 14, 2025
1433d59
start week 7 journal
Jul 14, 2025
f71a0e9
Enhance Week 7 journal with finalized metrics, output logs, and insig…
Jul 31, 2025
c6be32b
Add Week 8 journal with main goals and completed tasks
Aug 6, 2025
fe5b137
Add Week 9 journal with main goals, completed tasks, and documentatio…
Aug 6, 2025
4788830
Add Week 10 journal with main goals and planned tasks for final docum…
Aug 8, 2025
75 changes: 75 additions & 0 deletions Cromer/Daily Blog/Week 2
@@ -0,0 +1,75 @@
Week 2
Tuesday June 10th

- Created several Python files to start a simple job scheduler system
- Created the folder with 5 initial files
- Started working on starter code for the scheduler and task files

Wed June 11, 2025

- Attended AI + Edge Computing Lecture
- Prepared notes for Thursday’s meeting:
- Initial prototype with `scheduler.py`, `task.py`, `main.py`, and added `node.py`
- Planning logging visibility (similar to `kube-scheduler-simulator`)
- Modular scheduler design with plugin-like architecture: filter, score, reserve, bind
- Researched topics to implement and improve:
- Kubernetes plugin scheduler framework
- YAML-based input simulations (pods/nodes)
- Event logging best practices
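
The plugin-like design noted above (filter, score, reserve, bind) can be sketched roughly as follows. This is a minimal illustration, not the project's actual `scheduler.py`: the `Node` and `Task` fields, the CPU-only resource model, and the "most free CPU wins" scoring rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node model: only free CPU is tracked.
    name: str
    cpu_free: float
    reserved: list = field(default_factory=list)


@dataclass
class Task:
    # Hypothetical task model: a name and a CPU request.
    name: str
    cpu_request: float


def filter_nodes(task, nodes):
    """Filter phase: keep only nodes with enough free CPU for the task."""
    return [n for n in nodes if n.cpu_free >= task.cpu_request]


def score_node(task, node):
    """Score phase: higher is better; here, CPU left over after placement."""
    return node.cpu_free - task.cpu_request


def reserve(task, node):
    """Reserve phase: deduct resources so later tasks see the reduced capacity."""
    node.cpu_free -= task.cpu_request
    node.reserved.append(task.name)


def bind(task, node, log):
    """Bind phase: record the final decision in the event log."""
    log.append(f"{task.name} -> {node.name}")


def schedule(task, nodes, log):
    """Run one task through filter, score, reserve, and bind in order."""
    candidates = filter_nodes(task, nodes)
    if not candidates:
        log.append(f"{task.name} -> unschedulable")
        return None
    best = max(candidates, key=lambda n: score_node(task, n))
    reserve(task, best)
    bind(task, best, log)
    return best


nodes = [Node("rpi", 1.0), Node("jetson", 4.0), Node("blade", 8.0)]
log = []
for t in [Task("task_A", 2.0), Task("task_B", 3.0), Task("task_C", 7.0)]:
    schedule(t, nodes, log)
print(log)
```

Keeping each phase as its own function means a different filter or scoring rule can be swapped in without touching the rest of the pipeline.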

- Practiced Python scheduling logic
- Explored GitHub repo: [kube-scheduler-simulator](https://github.com/kubernetes-sigs/kube-scheduler-simulator)

- Started drafting the problem statement, motivation, and hypothesis for the presentation

Thursday June 12, 2025

- Tasks Completed:

- Met with the writing coach to discuss how to present and explain technical progress more clearly.
- Continued reviewing how to improve the project code based on my research into scheduling simulations and Kubernetes schedulers.
- Met with the team at 3 PM for regular project sync.
- Met with Yongho after the team meeting to review the code I’ve written so far and discuss the next steps.

- Code Progress:

- Began implementing a new Node class (node.py) to simulate compute nodes (rpi, jetson, blade) for more realistic task execution modeling.
- Planning to use this class to:
- Simulate real node states
- Monitor resource usage
- Implement distributed task execution logic
- Reset system state for multiple test runs
- Inject simulated workloads
- Adjusting the scheduler.py file so that it delegates execution to the nodes instead of running tasks in the scheduler directly.

Friday June 13, 2025
- In progress (carried over from the day before):
- Continuing to explore modular scheduling based on plugin-style patterns inspired by Kubernetes (Filter, Score, Bind phases).
- Preparing to integrate logging and metrics tracking for node performance and task flow.
- Added functionality to:
- Print how many times each task was executed
- Track how many tasks each node handled
- Log which node executed which task

Meeting
- Met with Yongho for progress check-in.

Code Improvements
- ✅ Fixed initial import/module errors in scheduler setup.
- ✅ Enhanced scheduler system to:
1. Track how many times each task was executed.
2. Introduce a `Node` class to simulate compute nodes: `rpi`, `jetson`, and `blade`.
3. Assign tasks to nodes dynamically — nodes now execute the tasks, not the scheduler.
4. Log which node executes which task.
5. Add execution summaries grouped by both task and node.
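
A rough sketch of the improvements listed above, under stated assumptions: the round-robin assignment, the class layout, and the method names (`execute`, `submit`) are hypothetical and are not taken from the project's `node.py` or `scheduler.py`. The point is only that the scheduler picks a node and the node runs the task, while both sides keep counts for the summaries.

```python
from collections import Counter
from itertools import cycle


class Node:
    """Simulated compute node that runs tasks and counts how many it handled."""

    def __init__(self, name):
        self.name = name
        self.tasks_handled = 0

    def execute(self, task_name):
        self.tasks_handled += 1
        return f"[{self.name}] executed {task_name}"


class Scheduler:
    """Assigns tasks to nodes; execution is delegated to the chosen node."""

    def __init__(self, nodes):
        self._next_node = cycle(nodes)  # round-robin is an assumption here
        self.task_counts = Counter()    # how many times each task ran
        self.log = []                   # which node executed which task

    def submit(self, task_name):
        node = next(self._next_node)
        self.log.append(node.execute(task_name))
        self.task_counts[task_name] += 1


nodes = [Node("rpi"), Node("jetson"), Node("blade")]
scheduler = Scheduler(nodes)
for name in ["task_A", "task_B", "task_C", "task_A"]:
    scheduler.submit(name)

# Execution summaries grouped by task and by node
print(dict(scheduler.task_counts))
print({n.name: n.tasks_handled for n in nodes})
```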

Study / Research
- Read GitHub repositories related to **edge resource analysis** to explore real-world scheduler designs.
- Watched tutorials/videos to better understand edge computing task allocation patterns.

Next Steps
- Expand node simulation to include resource capacity and load balancing.
- Look into logging task failures or retries.
- Possibly add threading to simulate parallel execution across nodes.


25 changes: 25 additions & 0 deletions Cromer/Daily Blog/Week 3
@@ -0,0 +1,25 @@
Weekly Progress Log

Monday June 16, 2025
• Watched Python tutorial videos to strengthen scripting skills.
• Conducted research on scheduler simulators for distributed systems.
• Began reviewing relevant GitHub repositories related to task scheduling and edge computing.

Tuesday June 17, 2025
• Continued reading and analyzing GitHub repositories.
• Focused on understanding the logic and structure behind existing scheduler implementations.
• Attended Dr. Wilkie’s CS seminar on High Performance Computing (HPC), gaining insights into large-scale task scheduling and system efficiency.

Wednesday June 18, 2025
• Participated in two academic seminars.
• Deepened understanding of repository logic and scheduling strategies through hands-on code reading.

Thursday June 19, 2025
• Met with mentor in the morning to discuss project goals and roadmap.
• Set up Linux development environment on a laptop for code testing.
• Successfully ran the Waggle scheduler code; identified an error in an alternate scheduler implementation.
• Met with the project team in the afternoon for collaboration and planning.
Friday June 20, 2025
• Revisited the scheduler code to reinforce understanding of its logic and structure.
• Took detailed notes on areas for improvement and potential refactoring.
• Prepared for the upcoming week by outlining next steps and identifying key focus areas.
117 changes: 117 additions & 0 deletions Cromer/Daily Blog/Week 4
@@ -0,0 +1,117 @@

Weekly Log: Scheduler Integration and Optimization (Week of June 24)


Weekly deliverables

Task 1:
Run and test simulation using Quick Start instructions

Task 2-1:
Uncomment scheduler in mywaggle.yaml

Task 2-2:
Fix bug in FairshareScheduler and demonstrate it running in simulation

Task 3:
Integrate Henry’s dual descent scheduler from Colab link into the scheduler/ directory and run it with simulation



Monday

• Met with Yongho → Received this week’s deliverables

• Completed:
• Passport Safety Training
• Section 101 Security Training
• Additional safety module
• Started:

• Task 1: README.md Quick Start
• Ran, tested, and debugged simulation code
• Investigated bug in FairshareScheduler



Tuesday

• Worked on research journal: methodology & design notes
Focused on:

• Fixing Fairshare bug:
error on line 10

# Old (error)
cpu_usage = [node.metrics["cpu"] for node in nodes]

# Fixed: use .get() with a default so a missing "cpu" key no longer fails
cpu_usage = [node.metrics.get("cpu", float("inf")) for node in nodes]
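
The effect of the fix can be seen in a small standalone example. The `Node` class and the "pick the least-loaded node" step below are assumptions for illustration, not the actual FairshareScheduler code; the idea is that a node with no `"cpu"` entry gets an infinite usage value and therefore is never chosen as the least-loaded node.

```python
class Node:
    """Hypothetical stand-in for a simulation node with a metrics dict."""

    def __init__(self, name, metrics):
        self.name = name
        self.metrics = metrics


nodes = [
    Node("rpi", {"cpu": 0.7}),
    Node("jetson", {}),          # no "cpu" metric reported yet
    Node("blade", {"cpu": 0.2}),
]

# node.metrics["cpu"] would raise KeyError for "jetson";
# .get() with a default of infinity keeps the list complete instead.
cpu_usage = [node.metrics.get("cpu", float("inf")) for node in nodes]

# Assumed selection rule: choose the node with the lowest CPU usage.
least_loaded = nodes[cpu_usage.index(min(cpu_usage))]
print(cpu_usage, least_loaded.name)
```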


Had to fix errors to make the code run:
• Installed missing packages (tensorboard)
eea971c8f87b5cf2e53c94398ccd6f773a16adfe
0080f1099258f363167f8bdc8ded18c1defcdcaf

• Used --config flag to avoid manually selecting the YAML file every run:

--config /home/kendal/edge_resource_analysis/simulation/kubernetes/mywaggle.yaml
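
For reference, a `--config` flag of this kind is typically wired up with `argparse`. The sketch below is an assumption about how such a flag could be defined; it does not reproduce the simulation's real argument parser, and the default value and help text are made up for the example.

```python
import argparse


def build_parser():
    """Build a parser with a single --config option (hypothetical layout)."""
    parser = argparse.ArgumentParser(description="Run the scheduler simulation (sketch).")
    parser.add_argument(
        "--config",
        default="mywaggle.yaml",  # assumed default, for illustration only
        help="Path to the YAML file describing the simulated cluster",
    )
    return parser


# Passing the argument list explicitly keeps the example independent of sys.argv.
args = build_parser().parse_args(
    ["--config", "/home/kendal/edge_resource_analysis/simulation/kubernetes/mywaggle.yaml"]
)
print(args.config)
```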


Completed Task 2-2 → Working demonstration of FairshareScheduler with debugged logic




Wednesday

• Moved Henry’s dual descent code into the scheduler/ directory (dual_node.py)

• Installed cvxpy:

Bash: pip install cvxpy

• Fixed Henry's code in dualnode.py:
46b2a7c7a65af8114c42dca50a599d0c06d85433

• Learned:

• cvxpy enables mathematical optimization in Python using variables, objectives, constraints

• Attended a seminar (HPC focus)

• Wrote prep notes for Thursday code explanation for the meeting



Thursday

• Morning: Met with mentor to review simulation logic and scheduler goals

• Afternoon: Met with team

• Reached out to Henry to work together on the following:

• Clarify lambda format

• Continue testing dual descent scheduler

• Make centralized_solver.py runnable with simulation

• Get clarification on whether the output of my run matches the output from his run




Friday

• Completed Task 3: Integrated and ran Henry’s scheduler in the simulation

• Verified: It integrates with run.py
• CVXPY logic solves the problem and dual update works
• Reviewed SAGE documentation for formatting results:
https://www.anl.gov/mcs/Sage-MSRI-1

• Started working on next steps for my journal
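
To make the "dual update" idea concrete, here is a generic dual-decomposition sketch in plain Python. It is not Henry's scheduler and does not use CVXPY: the cost matrix, demands, capacities, step size, and function name are all invented for the example. Each node has a price (lambda) that rises while the node is over capacity, and each workload picks the node with the lowest cost plus price.

```python
def dual_update_schedule(cost, demand, capacity, step=0.5, iterations=10):
    """Toy dual decomposition for workload placement.

    cost[i][j]  : cost of running workload i on node j
    demand[i]   : resource demand of workload i
    capacity[j] : resource capacity of node j
    """
    n_workloads, n_nodes = len(demand), len(capacity)
    lam = [0.0] * n_nodes  # one non-negative price per node
    choice = [0] * n_workloads
    for _ in range(iterations):
        # Primal step: each workload picks its cheapest node at current prices.
        choice = [
            min(range(n_nodes), key=lambda j: cost[i][j] + lam[j] * demand[i])
            for i in range(n_workloads)
        ]
        # Measure how loaded each node is under that choice.
        load = [0.0] * n_nodes
        for i, j in enumerate(choice):
            load[j] += demand[i]
        # Dual step: raise the price of overloaded nodes, lower it otherwise,
        # and clip at zero so prices stay non-negative.
        lam = [max(0.0, lam[j] + step * (load[j] - capacity[j])) for j in range(n_nodes)]
    # Placement matrix: rows are workloads, columns are nodes.
    placement = [[1 if j == choice[i] else 0 for j in range(n_nodes)] for i in range(n_workloads)]
    return placement, lam


cost = [[1.0, 2.0], [1.0, 3.0]]   # both workloads prefer node 0
demand = [1.0, 1.0]
capacity = [1.0, 2.0]             # node 0 can hold only one of them
placement, lam = dual_update_schedule(cost, demand, capacity)
print(placement, lam)
```

In this toy case the price on node 0 climbs until the first workload moves to node 1, after which the placement stops changing. With integer placements this kind of update can oscillate on other inputs, so the sketch should be read as an illustration of the update rule rather than a guaranteed solver.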

88 changes: 88 additions & 0 deletions Cromer/Daily Blog/Week5.md
@@ -0,0 +1,88 @@
Weekly Progress Log – Scheduler Evaluation & Visualization (July 1–5, 2025)

Weekly Focus:

Extracting meaningful metrics from TensorBoard
Comparing performance between WaggleScheduler and FairshareScheduler
Fixing energy consumption reporting bugs
Building visualizations for clarity
Adapting Henry’s dual descent scheduler into simulation




Goals for this week


Task 1
Convert TensorBoard logs into CSV for better metric visibility

Task 2
Add per-node statistics: mean energy, workload count, task time

Task 3
Compare all metrics across different schedulers

Fixes
Address bug where TensorBoard showed energy=0 despite CPU usage

Enhancement
Add support for visualizing and identifying scheduler results

Research
Integrate Henry's dual_descent into new scheduler module


Monday

🛠 Bugfix: Fixed issue where TensorBoard reported 0 energy usage even with non-zero CPU.
Subtask: bug fix 26c24d2a7f13a7f0fafd66713ecb27390a0f12b0

🔧 Added new metrics lines to __init__() for energy tracking.
✅ Simulation now records meaningful energy data.
🧪 Verified both WaggleScheduler and FairshareScheduler logs generate usable CSVs.




Tuesday

🤝 Met with Henry to discuss dual descent scheduler integration.
📈 Clarified expected output from dual_descent:
A matrix like:
[[0, 1], [1, 0], [0, 1]] → Each row = workload, column = node.
🧠 Brainstormed how to pass simulation state to dual descent function.
🧰 Began adapting simulation run to accept new dual-descent placement matrix.
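
The placement matrix format described above can be turned into per-node assignments with a small helper. This is a sketch under assumptions: the function name and the workload and node labels are hypothetical, and the check that each row contains exactly one `1` is an added sanity check, not something confirmed by the dual descent code.

```python
def assignments_from_matrix(placement, workloads, nodes):
    """Map a 0/1 placement matrix (row = workload, column = node) to node -> workloads."""
    result = {node: [] for node in nodes}
    for workload, row in zip(workloads, placement):
        # Each workload must be placed on exactly one node.
        if len(row) != len(nodes) or not set(row) <= {0, 1} or sum(row) != 1:
            raise ValueError(f"invalid placement row for {workload}: {row}")
        result[nodes[row.index(1)]].append(workload)
    return result


placement = [[0, 1], [1, 0], [0, 1]]
assignments = assignments_from_matrix(placement, ["w0", "w1", "w2"], ["node0", "node1"])
print(assignments)
```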




Wednesday
Met with Yongho
✅ Updated run.py line 103 to include total energy in TensorBoard logs.
📊 TensorBoard now outputs energy correctly.
🧠 Brainstormed visual identification for scheduler logs:
Current challenge: All logs look similar (30+ runs), so it is hard to trace which run belongs to which scheduler.
Solution: Need clear naming/labeling for scheduler in TensorBoard or CSV.
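
One possible way to label runs is to put the scheduler name into the log directory name, so it shows up as the run name in TensorBoard and can be recovered when exporting to CSV. The helper names and the `<scheduler>-<date>-<time>` naming scheme below are assumptions for illustration, not the convention used in the project.

```python
from datetime import datetime
from pathlib import Path


def run_log_dir(base, scheduler_name, started_at):
    """Build a log directory such as runs/FairshareScheduler-20250702-101500."""
    stamp = started_at.strftime("%Y%m%d-%H%M%S")
    return Path(base) / f"{scheduler_name}-{stamp}"


def scheduler_from_log_dir(log_dir):
    """Recover the scheduler name by stripping the trailing date and time parts."""
    return Path(log_dir).name.rsplit("-", 2)[0]


log_dir = run_log_dir("runs", "FairshareScheduler", datetime(2025, 7, 2, 10, 15, 0))
print(log_dir.as_posix(), scheduler_from_log_dir(log_dir))
```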



Thursday

🎨 Finished first version of visualization code:
Plots energy, task count, execution time per node.

✅ Task 2 completed: all metrics per node calculated.
✅ Task 3 attempted:
CSV output works
❌ Error: [ERROR] No scheduler data to compare.
Suspected cause: TensorBoard logs don’t differentiate waggle vs fairshare


🔧 Fixed file path bugs that caused directory mismatch errors earlier.
🧪 Now receiving output files with task metrics per scheduler; working toward full comparison.
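
Once each row carries a scheduler label, the per-node statistics from Task 2 can be grouped per scheduler for the Task 3 comparison. The sketch below uses only the standard library; the column names (`scheduler`, `node`, `energy`, `task_time`) and the sample values are invented for the example and do not come from the real CSV output.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample data standing in for the exported CSV.
SAMPLE_CSV = """scheduler,node,energy,task_time
waggle,rpi,1.0,2.0
waggle,rpi,3.0,4.0
waggle,jetson,5.0,1.0
fairshare,rpi,2.0,3.0
"""


def per_node_stats(csv_text):
    """Return {(scheduler, node): {workload_count, mean_energy, mean_task_time}}."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[(row["scheduler"], row["node"])].append(row)
    return {
        key: {
            "workload_count": len(rows),
            "mean_energy": mean(float(r["energy"]) for r in rows),
            "mean_task_time": mean(float(r["task_time"]) for r in rows),
        }
        for key, rows in groups.items()
    }


stats = per_node_stats(SAMPLE_CSV)
for key, values in sorted(stats.items()):
    print(key, values)
```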

My new working code:
f5bbbf49b94c5a4feaf7fbb2faf3b11c91e38df7

