Hacker Newsnew | past | comments | ask | show | jobs | submit | chiragrohit's commentslogin

How many tokens are you burning daily?


The real cost driver with agents seems to be the repetitive context transmission since you re-send the history every step. I found I had to implement tiered model routing or prompt caching just to make the unit economics work.


Not the OP but I think in case of scanning and tagging/summarization you can run a local LLM and it will work with a good enough accuracy for this case.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: