Hacker News | jtbaker's comments

this would make me so happy!

I know pandas has a lot of technical warts and shortcomings, but I'm grateful for how much it empowered me early in my data/software career, and the API still feels more ergonomic to me due to the years of usage - plus GeoPandas layering on top of it.

Really, I prefer DuckDB SQL these days for anything that needs to perform well, and I feel like SQL is easier to grok than Python code most of the time.


> Really, I prefer DuckDB SQL these days for anything that needs to perform well, and I feel like SQL is easier to grok than Python code most of the time.

I switched to this as well, and it's mainly because explorations would need to be translated to SQL for production anyways. If I start with pandas, I just need to do all the work twice.


chdb's new DataStore API looks really neat (drop-in pandas replacement) and exactly how I envisioned a faster pandas could be without sacrificing its ergonomics.

I'm not mad about it. Joe seems like a chill dude and is having fun.

The M5 ultra series is supposed to have some big gains around prompt processing - something like 3-4x from what I've read. I'm tempted to swap out my m4 mini that I'm using for this kind of stuff right now!

sung in the voice of Pumbaa

When he was a young botnet!

[1] https://youtu.be/__pNuslNCro


any way to run these via ollama yet?


Incoming force push to rewrite the history. Git doesn't lie!


I wouldn't put it past them...


I wouldn't put it in past tense...


It's very cool! If you want to get higher cache hit rates from a CDN or Redis etc. and lower the amount of S3 reads, you can set up a proxy to convert `/{z}/{x}/{y}.mvt` requests into the byte-range requests: https://docs.protomaps.com/deploy/

Brandon has some example code you can lift to dump it into a Cloudflare Worker or other platforms on that page.
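To make the proxy idea a bit more concrete, here is a minimal Python sketch of the two generic pieces: parsing a `/{z}/{x}/{y}.mvt` path and building the HTTP `Range` header for the upstream read. It is not the code from the linked page. The names `parse_tile_path`, `range_header`, `TileCoord` and `MAX_ZOOM` are made up for illustration, and the step that maps a tile to its byte offset and length inside the archive is format-specific and left out on purpose.

```python
import re
from typing import NamedTuple, Optional

# Matches paths of the form /{z}/{x}/{y}.mvt and nothing else.
TILE_PATH = re.compile(r"/(\d+)/(\d+)/(\d+)\.mvt")

# Hypothetical upper bound on zoom, chosen only to reject absurd input.
MAX_ZOOM = 30


class TileCoord(NamedTuple):
    z: int
    x: int
    y: int


def parse_tile_path(path: str) -> Optional[TileCoord]:
    """Return the tile coordinate for a /{z}/{x}/{y}.mvt path, or None if invalid."""
    match = TILE_PATH.fullmatch(path)
    if match is None:
        return None
    z, x, y = (int(group) for group in match.groups())
    if z > MAX_ZOOM:
        return None
    # At zoom z, valid x and y values lie in [0, 2**z).
    if x >= 2**z or y >= 2**z:
        return None
    return TileCoord(z, x, y)


def range_header(offset: int, length: int) -> dict[str, str]:
    """Build an HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive on both ends.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}
```

A proxy would call `parse_tile_path` on the incoming request, look up the tile's offset and length in the archive's index (not shown), and send the result of `range_header` to S3. The response can then be cached under the original tile URL, which is where the better CDN hit rate comes from.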


Thank you. I'm going to try this on a different project that we have. Our current deployment is designed to work directly through S3/API Gateway, which reduces the number of moving parts.

We update the tiles frequently, so the setup has been amazing for us.


You didn't need to rewrite it in C# to do that - Python should be able to handle streaming that amount/velocity of data fine, at least through a native extension like msgspec or pydantic. Additionally, you made it much harder for other data engineers who need to maintain/extend the project in the future to do so.


The C# is probably far more maintainable and less error prone than Python. At least in my experience that's almost always the case.

I've had plenty of Python jobs that run fine for several hours and then break with runtime errors, whereas with C# you can be reliably sure that if it starts running, it will finish running.


Not a language problem, it's a dev culture problem. You can hold your devs accountable for the quality of their code. Stronger typing support via static analysis, as well as runtime validation of untrusted input/data, has really helped Python a lot.

I'm not necessarily the biggest fan of Python, but writing a data engineering tool in a non-data engineering focused language seems like a bad decision. Now when the OP leaves, the organization is in a much tougher position.
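For what it's worth, here is a minimal stdlib-only sketch of what "static typing plus runtime validation of untrusted input" can look like in Python. The `Reading` record and `parse_reading` function are hypothetical names invented for this example. Libraries like msgspec or pydantic handle this kind of validation declaratively, and this sketch only shows the idea by hand.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class Reading:
    """A typed record that a static checker such as mypy can reason about."""

    sensor_id: str
    value: float


def parse_reading(raw: Mapping[str, Any]) -> Reading:
    """Validate an untrusted mapping at the boundary and return a typed Reading."""
    sensor_id = raw.get("sensor_id")
    value = raw.get("value")
    if not isinstance(sensor_id, str) or not sensor_id:
        raise ValueError("sensor_id must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("value must be a number")
    return Reading(sensor_id=sensor_id, value=float(value))
```

The point is that bad records fail at ingestion with a clear error, instead of several hours into the job when some downstream code finally touches the field.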


> Now when the OP leaves, the organization is in a much tougher position.

Are they really, though? You're assuming their org is unfamiliar with C#. Not all data engineers only know Python. The ones I work with mainly use C# because we all do!


I'm a software and data engineer. I work with C# pretty extensively in my software day job. I've never seen a data engineer job listing mention C#.

Additionally, given the way the OP's comment reads, I'm OK with the assumption I made. It reads like it was a unilateral decision on their part and not something that got buy-in from the team.


I'm glad they were able to pivot into Astro when Vite won the hot dev server game a few years back.

