Ask HN: How to train and host an LLM for internal docs
3 by nborwankar | 1 comment on Hacker News.
I need to investigate the scope and cost of training and hosting an LLM over a small-to-medium-sized corpus of documents (400K pages of web text plus PDFs), with the goal of providing a chat-like query interface. How do I a) scope and estimate the GPU-hours needed, and b) decide which pretrained transformer model(s) might be the best starting point?
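For part (a), a common back-of-envelope approach (not from the post itself) is the rule of thumb that training costs roughly 6 × parameters × tokens FLOPs, divided by the effective throughput of your GPUs. A minimal sketch, where the model size, tokens-per-page ratio, per-GPU TFLOP/s, and utilization figure are all illustrative assumptions:

```python
# Back-of-envelope GPU-hour estimate for fine-tuning an LLM.
# Uses the ~6 * params * tokens FLOPs rule of thumb; all concrete
# numbers below are illustrative assumptions, not measured values.

def finetune_gpu_hours(params_b, tokens_b, tflops_per_gpu=150.0, utilization=0.4):
    """Estimate GPU-hours for one training pass.

    params_b        -- model size in billions of parameters
    tokens_b        -- training tokens in billions
    tflops_per_gpu  -- peak TFLOP/s of a single GPU (assumed)
    utilization     -- fraction of peak actually achieved (assumed MFU)
    """
    total_flops = 6 * (params_b * 1e9) * (tokens_b * 1e9)
    seconds = total_flops / (tflops_per_gpu * 1e12 * utilization)
    return seconds / 3600.0

# Rough corpus size: 400K pages at an assumed ~500 tokens/page ≈ 0.2B tokens.
# Fine-tuning an assumed 7B-parameter model over it once:
hours = finetune_gpu_hours(params_b=7, tokens_b=0.2)
print(f"~{hours:.0f} GPU-hours per epoch")  # → ~39 GPU-hours per epoch
```

Numbers this small are one reason many teams answer this question with retrieval-augmented generation (indexing the docs and feeding retrieved passages to a pretrained model) rather than full training, but the estimator above gives a first-order feel for the fine-tuning cost either way.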