So I did some research and found a way to access article body text without having to use Diffbot, so that'll be fun. I found a Python wrapper for boilerpipe [1], so I plan on redoing this analysis. The amount of data I had to work with was pathetic. This time around I'll use jv22222's suggestion on how to get the hyperlinks out of the tweets too. Thanks all! This was fun :)
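For the hyperlink part, here is a rough sketch of one way to pull links out of raw tweet text with a regex. This is only an assumption about how it could be done, not necessarily what jv22222 suggested; `extract_links` and `URL_PATTERN` are made-up names, and shortened t.co links would still need to be resolved separately.

```python
import re

# Hypothetical helper, not from the original analysis:
# match anything that starts with http:// or https:// up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_links(tweet_text):
    """Return every http(s) URL found in a tweet, minus trailing punctuation."""
    return [url.rstrip(".,;:!?)\"'") for url in URL_PATTERN.findall(tweet_text)]


if __name__ == "__main__":
    print(extract_links("Great read http://t.co/abc123, also https://example.com/post."))
```

Each extracted URL could then be handed to whatever article extractor is in use to fetch the body text.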
You'll also notice some editorializing of the topics (venture -> venture capital).
The LDA features are already grouped; that's exactly what LDA does. However, translating a group of words into a "summary" (whatever that is) is nontrivial. You'd first need to define what you're looking for. A visual summary, for example, might be a word cloud; another might be ontology tagging, if you consider those tags a salient summary.
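To make that concrete, here is a minimal sketch of two cheap "summaries" of a single topic: its top few words, and font sizes for a word cloud. It assumes you already have per-topic word weights from whatever LDA implementation you used; the `topic` dict below is made-up data, and `top_terms` and `cloud_sizes` are hypothetical helpers, not part of any library.

```python
def top_terms(topic_weights, n=3):
    """Return the n highest-weighted words of one topic, heaviest first."""
    ranked = sorted(topic_weights.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:n]]


def cloud_sizes(topic_weights, min_pt=10, max_pt=40):
    """Map each word's weight linearly onto a font size for a word cloud."""
    lo = min(topic_weights.values())
    hi = max(topic_weights.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero when all weights are equal
    return {
        word: round(min_pt + (weight - lo) / span * (max_pt - min_pt))
        for word, weight in topic_weights.items()
    }


# Made-up weights for illustration only, not output from the analysis.
topic = {"venture": 0.30, "capital": 0.25, "funding": 0.20, "seed": 0.05}

if __name__ == "__main__":
    print(top_terms(topic, n=2))
    print(cloud_sizes(topic))
```

Neither of these decides what the topic should be called; turning "venture, capital, funding" into the label "venture capital" is still a human (or ontology) judgment.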
The rest of it is pretty bush league.