Note on GitHub - Exo-Explore/Exo: Run Your Own AI Cluster at Home With Everyday Devices 📱💻 🖥️⌚ via GitHub
exo optimally splits up models based on the current network topology and device resources available. This enables you to run larger models than you would be able to on any single device.
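To make the splitting idea concrete, here is a minimal sketch of one way to divide a model's layers across devices in proportion to their memory. This is an illustration only, not exo's actual code or API: the function `partition_layers`, the device names, and the memory figures are all hypothetical, and it ignores network topology entirely.

```python
def partition_layers(num_layers: int, device_memory_gb: dict[str, float]) -> dict[str, range]:
    """Assign contiguous layer ranges to devices in proportion to their memory.

    Hypothetical helper for illustration; not part of exo.
    """
    total = sum(device_memory_gb.values())
    assignments: dict[str, range] = {}
    start = 0
    cumulative = 0.0
    for name, mem in device_memory_gb.items():
        cumulative += mem
        # Round the cumulative share so the ranges tile [0, num_layers) with no gaps.
        end = round(num_layers * cumulative / total)
        assignments[name] = range(start, end)
        start = end
    return assignments


# Hypothetical home cluster: memory in GB per device (assumed values).
devices = {"macbook": 16.0, "desktop": 32.0, "phone": 8.0, "mini": 8.0}
shards = partition_layers(32, devices)

for name, layers in shards.items():
    print(f"{name}: layers {layers.start}..{layers.stop - 1} ({len(layers)} layers)")
```

Each device only has to hold its own slice of the layers, which is why the combined cluster can run a model that no single device could fit.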
The embeddings for Llama-3-8B are around 8KB-10KB. For Llama-3-70B they're around 32KB. These are small enough to send around between devices on a local network.
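A rough back-of-the-envelope check on those sizes: one token's activation vector is the model's hidden size times the bytes per value. The hidden sizes below (4096 for Llama-3-8B, 8192 for Llama-3-70B) come from the published model configs. The numeric precision is an assumption, since the note does not say which one exo uses, so both 2-byte and 4-byte values are shown. The 1 Gbit/s link speed is also an assumption, and the transfer time ignores latency and protocol overhead.

```python
def activation_bytes(hidden_size: int, bytes_per_value: int) -> int:
    """Size of one token's hidden-state vector in bytes."""
    return hidden_size * bytes_per_value


def transfer_ms(num_bytes: int, link_mbps: float) -> float:
    """Idealized time to push num_bytes over a link, in milliseconds."""
    return num_bytes * 8 / (link_mbps * 1_000_000) * 1000


# Hidden sizes from the published Llama 3 configs.
HIDDEN = {"Llama-3-8B": 4096, "Llama-3-70B": 8192}

for model, hidden in HIDDEN.items():
    for label, width in (("2-byte", 2), ("4-byte", 4)):  # precision is an assumption
        size = activation_bytes(hidden, width)
        print(f"{model} {label}: {size // 1024} KiB, "
              f"{transfer_ms(size, 1000):.3f} ms at 1 Gbit/s")
```

Under these assumptions the 8KB figure matches Llama-3-8B at 2 bytes per value and the 32KB figure matches Llama-3-70B at 4 bytes per value. Either way, the payload crosses a gigabit link in well under a millisecond.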
This kind of swarm compute is so cool and should be more common. It definitely gets us closer to frugal computing, salvage computing, and permacomputing.
AIHorde is another example (but using peer compute).
Reference
- Notes
- composability, llm, network-theory
- GitHub - Exo-Explore/Exo: Run Your Own AI Cluster at Home With Everyday Devices 📱💻 🖥️⌚
- Permalink to 2024.NTE.125
- Insight