Ask HN: Is there really a market for deep learning skills without a Ph.D?
171 points by meow_mix on Jan 24, 2017 | 58 comments
Earlier I was intrigued by a crash course posted on here titled "Learn Tensorflow and deep learning without a Ph.D", linking to a GCP page here: https://cloud.google.com/blog/big-data/2017/01/learn-tensorflow-and-deep-learning-without-a-phd

Would those offering jobs related to deep learning really be comfortable offering a position building / using these kinds of models to someone without an advanced degree?

Those who have gotten a job in deep learning / machine learning without an advanced degree, could you share your experience?



Yes! Deep learning is a young field. I have a B.S., for example, and have published work at Interspeech (speech recognition conference) and NIPS workshop (machine learning conference). You can too.

My advice:

  * 80% of machine learning is software engineering. If you're a strong software engineer, with some good math foundations (calculus, probability, matrix algebra), you can make significant contributions.

  * For deep learning, you can start with the crash course, but you'll eventually want to read a full textbook. If you read the new Goodfellow, Bengio, and Courville textbook and understand the details of the algorithms, then you'll be up to speed with the major ideas in deep learning today: http://www.deeplearningbook.org/

  * Find a company with existing deep learning experts who need software help -- that's most companies applying deep learning.

  * Read research papers. Some will be easy to read, some will be hard. If they're hard, read the references they cite. Eventually they'll get easier.
Let me know if you need help!


For those interested, here is the relevant Interspeech paper[1]. The NIPS paper seems to be not published yet.

[1] https://static.googleusercontent.com/media/research.google.c...


The NIPS workshop was Machine Learning for Health. The paper isn't out as a PDF since a more mature version of the work will be submitted to a medical journal: http://nipsml4hc.ws


> 80% of machine learning is software engineering. If you're a strong software engineer, with some good math foundations (calculus, probability, matrix algebra), you can make significant contributions.
Do you have some examples of this? I'd love to have a look at some real world ones.


Not sure if this is what you're looking for, but Andrew Ng touched on a lot of pragmatic parts of software engineering in machine learning teams at Deep Learning School: https://www.youtube.com/watch?v=F1ka6a13S9I&t=3380s


Not sure what example you want. If you are talking about code, take a look at the torch source code. It is pretty accessible.


scikit-learn is a poster child for best practices in machine learning engineering.


Yes, you! Your paper has a lot of fancy math. It would kill a software engineer like me. How did you do that?


The math is more approachable than it may seem. My recommendation is to read textbooks (not online tutorials or crash courses), which will guide you through the mathematical concepts and notation. Each little bit builds on the previous one, so if you read a few pages a day, every day, then after a few months concepts that once seemed very fancy will start to become more intuitive, since you've seen the parts arranged in a different way before.


Can I PM you?


Sure, my email is brandon@cardiogr.am.


I am a co-founder of a deep learning startup - https://tensorflight.com. Engineers without a PhD, but with some deep learning experience, are useful on applied deep learning teams like ours. Please get in touch at kozikow@tensorflight.com.

Training the model itself takes maybe 10-20% of the overall engineering effort. You need to have at least one person on the team who grinded their teeth on training deep learning models. My co-founder says at least half a year of full-time experience. I would say that there is nothing that would fundamentally require a PhD, just that very few people without a PhD have sufficient experience right now. What's the most important is an intuition what method will help the most in your situation.

Within model training there are many tasks that can be off-loaded to someone without heavy deep learning expertise - e.g. data augmentation or evaluation statistics. Someone with deep learning experience is still required to know which tasks will have the highest impact.

Apart from model training, there is lots of work on everything around it: inference pipeline, web server, data gathering, hardware. The majority is "plain engineering", but there are a few engineering skills specific to deep learning, e.g. the inference pipeline or hardware to train models. What's more, even within a domain like computer vision, "classic" methods are sometimes still more cost-effective than deep learning.


>What's the most important is an intuition what method will help the most in your situation.

A PhD doesn't teach you a method, it teaches you this.

That intuition comes from years of specialized training that no crash-course or bootcamp can teach.


"deep learning bootcamp" wouldn't be sufficient.

On the other hand, training deep learning models was a very different world two years ago. From a few examples in the field I know of, someone reasonably talented with decent math fundamentals can catch up with the state of the art in 6-12 months.


>someone reasonably talented with decent math fundamentals can catch up with the state of the art in 6-12 months.

Being able to parrot what is known is far, far, far different from being able to ask what comes next.

I am talking about the training it takes to develop a critical and creative mind that can ask and answer novel questions.


cut their teeth!

I'm surprised there are hardly any PhDs specifically in "deep learning", given the youth of the term. Would a PhD in some related area of statistics count for something?


My background is that I did my undergrad at JHU, then did applied Machine Learning research for two years at Amazon, and now I'm a PhD student in ML/DL. I was very into ML during undergrad, so overall I have ~7 years of experience.

My view is that Machine Learning is a deep field which could take decades to fully appreciate, but which is also easily accessible, especially if one is interested in building applications.

I think that a good litmus test for ML expertise is asking someone what % of all NIPS papers from last year they'd be able to understand well enough to reproduce after a quick reading.

https://nips.cc/Conferences/2016/AcceptedPapers

Even though I have ~7 years of experience doing ML, I would say that I could probably only fully appreciate ~10% of all NIPS papers.


How difficult is it to get into a position at a top company where you can do ML research, without having a PhD?

And I'm also wondering about going back to university: Is it difficult to get into a good ML PhD program if you've been in the industry for some years?


Very difficult, at least for research. Top companies with ML research labs (FB, Google, Uber) are basically externalized academic labs. The heads of these labs are simply bringing talent over from their former departments (NYU, Cambridge/UBC, CMU).

It is not impossible to work at these labs as an engineer without a PhD, but I don't have a deep understanding of these roles.


Hmm. even though "understand well enough to reproduce" != "fully appreciate", it seems like you're saying that being able to reproduce 10% of the papers is a reasonable benchmark for 7 years of experience. Is that what you meant?


I've hired a person without even a bachelor's. They competed in a Kaggle competition relevant to what we were working on, and were one of the top competitors. I reached out, hired them, and it worked out well for all sides. Eventually, they left to do a startup. But we got our money's worth, and they got a decent salary for a while, and a decent spot on their resume.

I don't think the emphasis should be on Ph.D so much as on demonstrating your competence. A thesis is a way to do that. Kaggle is another. A startup is another. So is doing some interesting crunching on data sets and writing blog posts or articles. Another way is to work up from Analyst to Data Scientist to ML (or other jobs -- you might be able to start in ML at a startup at a low salary).

But if you have no track record, you'll need to get one somehow. A single course degree isn't sufficient. That's 1/32nd of a Bachelor's degree, 1/8th of a Master's, or 1/25th of a Ph.D.


If you hit Ph.D. level, courses are an afterthought, and no longer a meaningful measure of the progress on your degree.

The situation reminds me of the first dotcom bubble, when folks were getting hired to write web apps with little formal training. I can only imagine the technical debt that is accumulating right now in the industry.


> folks were getting hired to write web apps with little formal training

Yes, well, at the time, most people with formal training wouldn't go near a web app project with a ten foot pole ("it'll all end in tears", they said). And old-school Unix hackers versed in Perl weren't necessarily any more likely to have a related degree.

> I can only imagine the technical debt that is accumulating right now in the industry.

Don't worry about it. Most technical debt will get wiped out with the failure of the company (usually for reasons unrelated to technical debt). Companies that survive will do so despite the technical debt, and will have the resources to rewrite things (hopefully avoiding Second System Syndrome).

OTOH, I can only imagine the future pain that low quality DL software patents are going to cause when IP from failed companies gets sold off.


Would you consider someone with a physics and math degree, and some online ML course experience differently than someone with a degree in, say, CS?


Machine learning in most shops is in desperate need of better engineering practices. PhDs are brilliant but they tend to only apply that brilliance narrowly. They often don't really care about things like re-usability, performance, testing, monitoring, or even version control. There is a large demand for 'model engineering' to basically help build pipelines to get them data and to create reusable frameworks so models can be iterated on, validated, and deployed quickly and easily. It's easy to say the new model has a better AUC than the old one, but the true test is whether the model in front of users actually does what it's supposed to do.


My answer is yes. I've worked in companies which have many engineers who work with deep learning. No PhD needed.

There is a difference between doing deep learning research and building a product powered by deep learning (with some correlation between success in each category and possession of a PhD). In my experience, the engineers are far, far better at building a product which can create value in the marketplace. Deep learning algorithms / architectures cannot do this alone. A product encapsulates a user experience which is often completely separated from the particular learner powering the experience. However, engineers without an understanding of basic ML practices (which apply more generally) cannot build great products: they tend to violate ML theory, e.g. they produce dirty data or infer causation where there is only correlation. You can see why Google is putting all of their engineers through a 6 month ML course.


A Ph.D. is about publishing papers in top journals and managing academic drama. If neither of those is your end goal, then if you can demonstrate an ability to bring value with ML then a Ph.D. is definitely not required (maybe half the ML group at our company has PhDs).


I would echo another comment here and say that it depends. However (speaking from my own perspective) depending on the discipline there's no specific need to do a PhD in order to understand and apply machine learning.

It also depends on what you mean by "advanced degree". I have an undergraduate degree in physics/math and a master's in economics/finance, and I find between those two things I've been able to follow developments in machine learning and also to apply them to my work. In fact, I used to get a bit annoyed with those who would imply that I "must" do a PhD ... I would say that's certainly true if I wanted to invent new estimators etc., but otherwise not so much. A PhD can be great for other reasons but it's not the sort of thing required for actually doing my job.


I've had to build some relatively simple deep learning systems around tensorflow, and I don't have any special training other than the usual engineering math and statistics.

My observation is that it's much more important to be clever with identifying possible inputs to train on rather than focusing too much on the machine learning itself. A crappy ML implementation that was trained on 20 data sets which are highly relevant and well curated does better than a highly tuned ML system with half the inputs and bad outliers.


A typical answer to any somewhat complicated question: it depends.

If you find use cases for your newly obtained knowledge, then you will be able to secure a job. It's all about practicality. You will most likely get hired to solve a specific problem, and if you are able to market yourself and showcase how your skills can boost revenues, you'll be okay.

That's where a "domain-diverse" mindset is important. I don't think a Ph.D is required anywhere outside of R&D. A Ph.D is certainly a plus, but not a requirement.


My 2 cents:

1. You don't need a Ph.D to build a product that uses DL/ML. Some part of your work here could be to define the problem and to understand how to phrase it so that it is solvable (if it's too easy, your moat won't be technological). Your contribution here could be applying deep learning to new applications. Specifically for applying DL in industry, someone with the ability to quickly try out a lot of things (with some amount of self-discipline) is good to have.

2. You don't need a Ph.D to be part of a team that is taking on a hard DL/ML problem. You will hopefully have leaders who can set the direction.

3. A Ph.D, like most degrees, is a label. If you can develop the skills of a good researcher (like any other craftsman), learning to comprehend research ideas quickly and keeping tabs on interesting ideas and recent progress, then you could easily find connections and even publish good papers.

4. Now if you really want to understand what happens deep down, why DL works, how it could be viewed as tensor decomposition, or you want to come up with new mathematical optimization or draw a new connection between sub-areas, then having dedicated time to build those skill sets (aka a Ph.D) is very helpful.


I'm working in the field (and winning contests) without a bachelor's. As was said here [0]

"Overall, machine learning systems can be thought of as a machine learning core — usually an advanced algorithm which requires a few chapters from Ian’s book to understand — surrounded by a huge amount of software engineering."

[0] https://blog.gregbrockman.com/define-cto-openai

So that's what I mostly do - write code, try new ideas :) Surely, I had to refresh my knowledge in some areas like probabilities, and have a general understanding how math works, but anyone can do that.


Many have gotten research engineer jobs (vs. research scientist). Check out the profiles of research engineers at places like Apple, Facebook, and OpenAI. Knowing deep learning is an advantage in these roles even if the job is primarily software engineering (not research / science).


Definitely. I have a high-paying data science job at a major corporation with only a 4 year degree in economics and got it after 2 years in the field working analyst jobs. I took stats classes with my free electives, got database skills from my first job and taught myself how to code along the way. Many of my peers have advanced degrees and I can hang no problem. I will say though, that the data science boot camps are not the answer. The applicants we receive from them are generally weak and we have never hired any of them.


you said none of the bootcamp people that applied were hired. out of those bootcamp people, how many had bachelors, and what was the distribution of their degrees? i.e. how many had degrees in CS, Math, Physics, Biology, etc. Alternatively, out of the math degrees that applied (regardless of bootcamp), how many were hired?


Of the bootcamp people, everyone has had at least a bachelor's, maybe weighted towards less STEM, but that's certainly not the rule. I don't think I have a good numbers-backed answer for you, but I also don't think prior academics is the issue. It's mostly people's ability to problem solve. We have no shortage of applicants but it still takes us forever to find people that can do the job.


You mentioned people's "ability to problem solve", and that it "takes forever to find people that can do the job". This is very interesting. How do you measure "problem solving ability" in your applicant pool when all you have are resumes? Do you simply look for points on the resume which reflect positive improvement? Do you have more than just a resume to base your judgement off of?

E.g. consider these two similar points on a resume: "improved loading times tenfold by using compound indexes and denormalization" would reflect an improvement from a problematic situation to a better situation. Whereas "designed schema and implemented indexed database" does not explicitly name a problem and its solution.

I hope my point makes sense, and I am wondering if little differences in the wording make a big impact in how you perceive someone's ability to identify and solve problems.


We give a data set and ask some open ended questions. Out of 800 applicants, maybe 50 complete the exercise, and then maybe less than 10 we'll want to talk to based on what they sent. I spent maybe 5 hours on the dataset when I was interviewing for the position myself.


Wow, only 6% complete the task? I am testing my luck here, but do you have any sample/link to a data-set or questions you have asked before, or maybe a link to something similar to what you have asked?


The dataset is private, but the general gist is it's about 100,000 rows with conversion and revenue data, with time and 2 other dimensions. There are some outliers (clearly errors) in the data, strong seasonality differences in some dimensions, and data sparsity in others. The applicant isn't told about these, but we look for them to find these issues in their analysis and control for them in a predictive model they are asked to build based on the data.
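
For anyone who wants to practice on something similar, here is a rough sketch of the kind of checks described above, run on synthetic data. The real dataset is private, so the column names (`month`, `revenue`), the seasonal lift, and the injected error values are all invented assumptions, and the median/MAD rule is just one common robust choice, not necessarily what the hiring team looks for.

```python
import random
import statistics
from collections import defaultdict


def make_rows(n=2000, seed=0):
    """Synthetic stand-in for the private dataset (all values are invented)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        month = rng.randint(1, 12)
        base = 150.0 if month in (11, 12) else 100.0  # assumed seasonal lift
        rows.append({"month": month, "revenue": rng.gauss(base, 10.0)})
    for row in rows[:5]:  # inject a few obvious data-entry errors
        row["revenue"] = 1_000_000.0
    return rows


def flag_outliers_by_month(rows, threshold=8.0):
    """Flag rows far from their own month's median, in robust sigma units."""
    by_month = defaultdict(list)
    for row in rows:
        by_month[row["month"]].append(row["revenue"])
    stats = {}
    for month, values in by_month.items():
        med = statistics.median(values)
        mad = statistics.median([abs(v - med) for v in values])
        stats[month] = (med, 1.4826 * mad)  # 1.4826 * MAD approximates std for normal data
    flags = []
    for row in rows:
        med, scale = stats[row["month"]]
        flags.append(abs(row["revenue"] - med) > threshold * scale)
    return flags


def monthly_means(rows, flags):
    """Mean revenue per month, using only rows that were not flagged."""
    by_month = defaultdict(list)
    for row, bad in zip(rows, flags):
        if not bad:
            by_month[row["month"]].append(row["revenue"])
    return {m: statistics.mean(v) for m, v in sorted(by_month.items())}


rows = make_rows()
flags = flag_outliers_by_month(rows)
means = monthly_means(rows, flags)
print("flagged rows:", sum(flags))
for month, mean in means.items():
    print(f"month {month:2d}: mean revenue {mean:8.2f}")
```

Grouping by month before flagging matters here: with a single global threshold, legitimately high in-season rows can look like outliers, which is the "control for it" part of the exercise.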


In real life, having a Ph.D. means you might find, test and put forward a new and original case for your industry by using deep learning or whatever new tool. Without a Ph.D. and the burden of originality, the classical route stands: apply wherever you can, prepare for interviews, await response until you get an invitation and hopefully an offer.


There's always going to be a market for results. Better results, better recommendations, predictions and relevancy. These are the battle lines. Advance these and you're on the team, at least my team. Nothing beats side-by-side comparisons despite the reputations industry or academia would like to safeguard.


Ask yourself this: Is there a market for database developers who don't have a Ph.D. in relational algebra? "AI" will be democratized very quickly, just like data (SQL, Excel, etc.) has been, and solving business problems with technology will be more important than ivory tower tinkering. That being said, if you're interested in deep learning, do yourself the favor of truly understanding what is happening with backpropagation and eventually learn to code networks by hand, from memory, simply to aid your own intuition.
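
To make "by hand" concrete, here is a minimal sketch of backpropagation for a tiny one-hidden-layer network in plain Python, with a finite-difference check on one gradient. The architecture (scalar input, tanh hidden units, squared-error loss) and every function name are my own choices for illustration, not anything from a particular framework.

```python
import math
import random


def init_params(hidden=3, seed=0):
    """Random weights for a 1 -> hidden -> 1 network."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = rng.uniform(-1, 1)
    return w1, b1, w2, b2


def forward(params, x):
    w1, b1, w2, b2 = params
    h = [math.tanh(w * x + b) for w, b in zip(w1, b1)]  # hidden activations
    y = sum(v * hj for v, hj in zip(w2, h)) + b2        # linear output
    return h, y


def loss(params, x, target):
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backward(params, x, target):
    """Chain rule, applied from the loss back to each parameter."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    dy = y - target                                  # dL/dy
    dw2 = [dy * hj for hj in h]                      # dL/dw2_j = dy * h_j
    db2 = dy
    # d tanh(z)/dz = 1 - tanh(z)^2, and h_j already holds tanh(z_j)
    dpre = [dy * v * (1.0 - hj * hj) for v, hj in zip(w2, h)]
    dw1 = [d * x for d in dpre]
    db1 = list(dpre)
    return dw1, db1, dw2, db2


def numeric_grad_w1(params, x, target, j, eps=1e-6):
    """Central finite difference for dL/dw1[j], used as a sanity check."""
    w1, b1, w2, b2 = params
    up = (w1[:j] + [w1[j] + eps] + w1[j + 1:], b1, w2, b2)
    down = (w1[:j] + [w1[j] - eps] + w1[j + 1:], b1, w2, b2)
    return (loss(up, x, target) - loss(down, x, target)) / (2 * eps)


params = init_params()
x, target = 0.7, 0.3
dw1, db1, dw2, db2 = backward(params, x, target)
print("analytic dL/dw1[0]:", dw1[0])
print("numeric  dL/dw1[0]:", numeric_grad_w1(params, x, target, 0))
```

The two printed numbers should agree to several decimal places; when they don't, the bug is almost always in the hand-written `backward`, which is exactly the intuition-building part.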


> "AI" will be democratized very quickly...

Honestly, I think we're already there, with frameworks like TensorFlow/Keras, Torch, and others.

As a software engineer with 25+ years of experience, these frameworks take a ton of the pain out of writing ML applications (particularly neural networks, which is where they are mainly focused). They also make it easy to integrate GPU and other multi-core-based training into the mix with almost zero effort.


Having good working knowledge of numerical programming, signal processing, statistics, etc. transfers relatively well into data science and deep learning. But if you see a graph like this: http://imgur.com/a/bxR2x and don't immediately recognize what kind of process generates it, you will find it difficult to understand and analyze why things work and why they don't. The ability to read a difficult research paper, then reproduce and implement it, might be a good test of your skills if you doubt them.

In most data science/mining/analytics companies there are PhDs working as chief scientist, senior analytical lead, etc. Below that there can be junior analytic positions that are more programming-oriented, but they require a good mathematical background. There are also junior positions for programmers who mostly do programming as part of the team. In larger companies there are senior engineer positions that concentrate on numerical programming and implementing company-specific algorithms, etc. Understanding the terminology and software is very valuable and helps in getting into these positions (software developer, engineer) even without deep domain knowledge. Someone who knows the ins and outs of low-level graphics and game programming might be a valuable asset, and that knowledge might transfer.


Is the graph specific to deep learning, or do you just mean something like a bimodal distribution or some other statistics concept?


It's the probability distribution of an oscillating function (a sine wave) with some noise. It's just one test to see if one has working knowledge.

Recognizing distributions you have seen in the book is book knowledge. Seeing a distribution and being able to mentally picture what kind of signal it might be when drawn on x,y axes, and figuring it out, is working knowledge.
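
If you want to see the shape for yourself, here is a small sketch that samples a noisy sine wave and prints a text histogram of the values. The linked image isn't reproduced here, so the uniform phase, the noise level (`noise_std=0.05`), and the bin layout are assumptions picked only to show the general two-peaked shape.

```python
import math
import random

LO, HI, N_BINS = -1.25, 1.25, 20
WIDTH = (HI - LO) / N_BINS


def sine_noise_histogram(n_samples=100_000, noise_std=0.05, seed=0):
    """Histogram of sin(t) + Gaussian noise, with t uniform over one period."""
    rng = random.Random(seed)
    counts = [0] * N_BINS
    for _ in range(n_samples):
        t = rng.uniform(0.0, 2.0 * math.pi)
        y = math.sin(t) + rng.gauss(0.0, noise_std)
        idx = math.floor((y - LO) / WIDTH)
        if 0 <= idx < N_BINS:  # ignore the rare sample outside the plotted range
            counts[idx] += 1
    return counts


counts = sine_noise_histogram()
for i, c in enumerate(counts):
    center = LO + (i + 0.5) * WIDTH
    print(f"{center:+.3f} {'#' * (c // 500)}")
```

A sine wave spends most of its time near its extremes, so the mass piles up near -1 and +1 with a shallow valley around 0; the noise just softens the two edges.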


The distribution of sin(X) depends on that of X. It can even be made to look Gaussian (well other than the tail).

Your post strikes me as annoyingly pretentious.


I'd assume there are a lot of menial tasks within the field that don't require years of academic experience. Not to mention one can learn on the job and get assigned increasingly complex tasks as the candidate proves his / her ability to learn quickly. Also, the salary of this type of candidate is probably lower in the long run due to the missing credentials and formal education.


Thanks for so many positive responses everyone! Seems like the outlook on this question is brighter than I had anticipated.


Yes! There are many companies (including mine!) who have DL experts but need software engineers. Ping me if interested :)


Could you share your company?


sixsamuraisoldier [at] gmail [dot] com

Yes, we allow remote work. You'll mainly be implementing fast deep learning algorithms such as FFT convolutions, quantization, etc.
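
For readers unfamiliar with the term, here is a minimal sketch of what an "FFT convolution" means in one dimension: pad both signals, multiply their transforms, and invert. This is a textbook illustration in pure Python with names I made up, not the company's implementation; production code would use an optimized FFT library and, for deep learning, typically 2-D cross-correlation rather than this 1-D true convolution. Both inputs are assumed to be non-empty.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def fft_convolve(x, h):
    """Linear convolution of x and h via zero-padded FFTs."""
    out_len = len(x) + len(h) - 1
    size = 1
    while size < out_len:  # pad to a power of two to avoid circular wrap-around
        size *= 2
    fx = fft([complex(v) for v in x] + [0j] * (size - len(x)))
    fh = fft([complex(v) for v in h] + [0j] * (size - len(h)))
    prod = [p * q for p, q in zip(fx, fh)]
    y = fft(prod, invert=True)
    return [v.real / size for v in y[:out_len]]  # inverse FFT needs the 1/size scale


def direct_convolve(x, h):
    """O(len(x) * len(h)) reference implementation."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            out[i + j] += xv * hv
    return out


signal, kernel = [1, 2, 3, 4], [1, 0, -1]
print("fft   :", [round(v, 6) for v in fft_convolve(signal, kernel)])
print("direct:", direct_convolve(signal, kernel))
```

The payoff is asymptotic: the direct version costs O(N*K) multiplications, while the FFT route costs O(M log M) for padded length M, which tends to win once kernels get large.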


Ping. Remote positions?


It depends on what you want to do. Training and testing models with deep learning doesn't require a PhD. Designing network architectures, debugging, and pushing a model to its limits, on the other hand, requires significant insight into the theoretical foundations. Doing something novel like GANs would require even more formal background. However, I would say anyone with undergraduate math (linear algebra and a bit of calculus) can learn all of this if they are determined and willing to invest a couple of years of intensive study time. PhD programs just facilitate this more easily, along with a certificate of approval from experts in the field that others can trust about your skill set.


I have met a lot of companies doing natural language processing, who are looking for BS and MS.


Likely that is actually a filter for experience in the NLP domain, which you aren't likely to get in most garden-variety SWE positions.


And yes, I published without a PhD as well.



