New comment by bckmn in 'Show HN: Tarsier – Vision utilities for web interaction agents'

Josh Beckman (www.joshbeckman.org)

Reminds me of [Language as Intermediate Representation](https://chrisvoncsefalvay.com/posts/lair/) - LLMs are optimized for language, so translate an image into language and they'll do better at modeling it.

Tags: llm, programming-languages, machines, tools, hacker-news
Wednesday, May 15, 2024
Permalink (2024.RPY.001): www.joshbeckman.org/replies/hacker-news-item-40369904