There's a quote I like that I think came from Marvin Minsky, "the US needs a Department of Homeland Arithmetic." We're protecting ourselves from all the wrong risks.
I don't understand why Marvin Minsky hates America so much that he wants to put the government in charge of assessing such risks. The government is an incredibly complex device for distilling a tiny amount of individual ignorance from each voter into a moonshine of batshittery.
Perhaps you have more effect by voting with (i.e. switching) TV channels. Public opinion seems to be very important for government policy, and popular shows can be, and are, seen as a barometer for it, I guess.
Uh, wouldn't this actually be called something like "the US Department of Education"?
It's been a rousing success, too - it's allowed our Ministry of Truth to claim things like "comrades! Our production of students that pass the standardized test is up 85% this quarter!".
It's all about families, man. Good families teach people what's really important - not federal departments of anonymoustaxeatingbureaucrat.
People didn't worry about that (i.e. formal education systems) until quite recently, but I guess what's "important" has gotten more complex and the rate of "dysfunction" has increased.
"Here's how that works: imagine that you've got a disease that strikes one in a million people, and a test for the disease that's 99% accurate. You administer the test to a million people, and it will be positive for around 10,000 of them – because for every hundred people, it will be wrong once (that's what 99% accurate means). Yet, statistically, we know that there's only one infected person in the entire sample. That means that your "99% accurate" test is wrong 9,999 times out of 10,000!"
No, it means that the "99% accurate" test is wrong 9,999 times out of 1,000,000. It would be clear to anyone when stated that way. What's counterintuitive is the author's statement of the result, not the result itself.
A better wording is that your chance of having the disease, given a positive result from the test, is about 1/10,000 (0.01%).
That's a huge increase from the prior 0.0001% chance of having the disease, but it's still not flat-out terrifying. Repeat testing can weed out the false positives at a speed proportional to its accuracy.
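To spell that out with Bayes' theorem (a quick check, assuming "99% accurate" means a 1% false-positive rate, which the article never actually specifies):

    # Bayes' theorem with Doctorow's numbers: a 1-in-a-million prior,
    # assuming "99% accurate" implies a 1% false-positive rate.
    prior = 1e-6
    p = 0.99 * prior / (0.99 * prior + 0.01 * (1 - prior))
    print(p)  # ~0.000099 -- about 1 in 10,100, i.e. roughly 0.01%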
If the test is picking up something in the person being tested, then yes, you'll get the same result every time and repeated testing proves nothing. But you can still repeat using other tests.
If the test gives false positives purely at random, then repeated testing will help. Say the test is wrong 50% of the time, and you do the test five times. If you get the same result every time, and the errors are independent, then the chance that all five runs were wrong is (50/100)^5, so you can be (1 - (50/100)^5) × 100 ≈ 97% sure of the results.
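For what it's worth, here's a sketch of how repeated testing plays out if you push it through Bayes' theorem, assuming independent errors and treating "99% accurate" as both the sensitivity and the specificity (my assumption; the article doesn't separate the two):

    # Posterior probability of disease after k independent positive results,
    # starting from the 1-in-a-million prior discussed above.
    def posterior(prior, accuracy, k):
        p_pos_sick = accuracy ** k        # P(k positives | sick)
        p_pos_well = (1 - accuracy) ** k  # P(k positives | well)
        num = p_pos_sick * prior
        return num / (num + p_pos_well * (1 - prior))

    for k in range(1, 5):
        print(k, posterior(1e-6, 0.99, k))
    # 1 -> ~0.0001  (the one-in-ten-thousand figure above)
    # 2 -> ~0.0097
    # 3 -> ~0.49    (three agreeing positives is roughly a coin flip)
    # 4 -> ~0.99

The base rate is punishing: even with a very accurate test, it takes several independent positives before the diagnosis is more likely than not.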
"But the fact is that attacks by strangers are so rare as to be practically nonexistent. If your child is assaulted, the perpetrator is almost certainly a relative."
Maybe that statement is true in today's world. But for tens of thousands of years, while our brains evolved, I would guess attacks from strangers were a lot more common.
The causation implied here is unlikely at best. More likely might be that until the last century no one had to deal with numbers large enough to need these kinds of statistics.
As someone who deals in diagnostic tests all day, I can tell you nobody's diagnosed on one test. There's a similar problem I worked out on the odds of getting into medical school. I'm sure this is elementary stats and has someone's name on it, but I'll call it the acceptance problem: how many medical schools do you have to apply to in order to have a 90% chance of getting into medical school?
For any school there is an acceptance quotient:
Q = (acceptances sent out)/(number of applications received)
For any given student applying to schools 1 through n, the goal is getting at least one acceptance, and applying to more schools, mathematically, can't possibly hurt in the closed case (neglecting social engineering, time spent on applications, etc.), so the chance of acceptance, Ca, approaches 1 with every new application in the following fashion:

    Ca = 1 - (1 - Q1)(1 - Q2)...(1 - Qn)
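As a sketch of the answer to the acceptance problem (assuming, unrealistically, that schools decide independently and share a single quotient Q; the 10% figure is invented for illustration):

    import math

    # Smallest number of independent applications needed so that
    # Ca = 1 - (1 - Q)**n reaches the target probability.
    def applications_needed(Q, target=0.90):
        return math.ceil(math.log(1 - target) / math.log(1 - Q))

    print(applications_needed(0.10))  # Q = 10% per school -> 22 applications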
Similarly, if the sensitivity of a test is 90%, that means the test identifies 9 of every 10 people with the diagnosis. If I administer n different tests, each with a sensitivity of S, then the chance of accurately diagnosing the disease, Cd, goes up with each additional positive (assuming the tests are independent) but never gets to 1.
So let's say you are doing genetic testing, and any one gene is 1% sensitive for the disease. If you tested 300 genes, you could be no more than 95% certain of the diagnosis:
(1 - (1 - .01)^300) × 100 = 95.09591...%
Now, if your genetic tests were 5% sensitive, your panel could be no more than 95% accurate with 59 tests.
If your test was 50% sensitive, your panel could consist of just 5 tests and already be more than 95% accurate (1 - .5^5 = 96.875%).
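A quick way to sanity-check those panel sizes (a sketch that leans on the same independence assumption questioned downthread):

    import math

    # Minimum number of independent tests, each with sensitivity s, for the
    # panel's combined sensitivity Cd = 1 - (1 - s)**n to exceed the target.
    def panel_size(s, target=0.95):
        return math.ceil(math.log(1 - target) / math.log(1 - s))

    print(panel_size(0.01))  # 299 (the 300 above lands at 95.1%)
    print(panel_size(0.05))  # 59
    print(panel_size(0.50))  # 5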
Of course, if some of the tests are negative, things get more complicated. One of the problems with these data sets is that we have no idea how predictive they are. You can't even calculate the predictive power of the database. There simply haven't been enough events. Then we get into surrogate measures (how many were positive on tests 1 - n and were found to have razor blades in their homes, etc).
The claim that these databases can't be effective isn't true. They could be. P might also equal NP. Whether the hypothesis is strictly true or not, the vague but real set of 'practical concerns' suggests that the truth of the hypothesis is sufficiently difficult to test as to render the null hypothesis the de facto assumption until proven otherwise.
The assumption of independence underlying the math you're doing is mathematically convenient but probably outright false. As one example, imagine the case where all n schools use the exact same admissions criteria. Unless you're sending a random application to each school, your math is shot; you will either get into every school or none of them. I won't even go into the genetic independence issue.
Every mathematical model is a false representation of reality. The predictive accuracy must be validated by experiment. And if you follow the link I referenced, you'll see there's actually some pretty strong data to start from in this case.
I believe the worst damage to statistics is specious reasoning. My favourite chocolate promotion is Mars' '1 in 6 Wins a Free Bar' (currently on here in Oz). If I buy 6 bars, most people would assume I would win once. In fact, I have only a 2/3 chance of winning a free one:
1 - (5/6)^6 = 1 - (5^6/6^6) ≈ 0.665
Buy 12 bars, and there's still a more than 10% chance I won't have won yet...most of the chocolate-buying government-voting lottery-praying public would be stunned.
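A quick check of those numbers (assuming a genuinely independent 1-in-6 chance per bar, which is itself generous to Mars):

    # P(at least one winning bar among n independent 1-in-6 chances)
    def p(n):
        return 1 - (5 / 6) ** n

    print(p(6), p(12))  # ~0.665 and ~0.888 -- an 11% chance of 12 losers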
Although the average number of bars you'd need to buy before you win is indeed six. I think almost no one would put the odds of actually winning one by buying six at 100%.
Why not? It's a one-in-four (or one-in-three, depending on who you believe) chance of being rich in a few years--that means that in the worst case scenario, if you keep at it, starting over when you fail, in about 10-15 years you're very nearly guaranteed to be rich. And you almost certainly won't starve in the process.
Of course, that assumes a reasonable level of intelligence, education, and drive.
Startups are about the best game going, as far as I can tell--I wouldn't be playing if the game was rigged against me (more than a little, anyway...sure, small companies have higher relative regulatory burden, but on the whole the technology game is actually rigged in favor of new companies, from a growth perspective).
I don't get how you're guaranteed to be rich in 10-15 years. If it's a 1/4 chance of being rich every x years, it's still just 1/4 chance over the span of 10-15 years, right?
However, I'm going to guess that you mean every time you try is going to influence the next time you try for the better (as evidenced by some paper about higher success rates for 2nd+ time entrepreneurs I remember on here), which makes sense. You learn from your mistakes, you make contacts, you have a better view of the market--so it shouldn't stay one in four every time you try.
No, just a mismatch in understanding your wording. With your original wording, it didn't make sense, since each roll would have the same probability, no matter how many times you rolled. That's why I figured you meant that rolls weren't independent of each other.
But it seems that you mean: what's the chance that, given x number of rolls, the very last one is a "1" (assuming you only need/want to get rich once)? As the number of rolls increases, the chance of that scenario (a string of non-1s with the last one being a 1) becomes smaller and smaller when taken as a whole.
How else could I possibly mean "keep trying for 10-15 years"? One can't take 2-5 year increments of your life in isolation, since, as you've noted, you only need to get rich once to be rich.
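For concreteness (treating each startup as an independent 1-in-4 roll, which is exactly the assumption this thread is arguing about):

    # P(at least one hit in k independent tries at p = 1/4)
    for k in range(1, 6):
        print(k, 1 - 0.75 ** k)
    # 1 -> 0.25, 2 -> 0.44, 3 -> 0.58, 4 -> 0.68, 5 -> 0.76
    # ~70-75% after 4-5 attempts: good odds, though short of a guarantee.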
The Internet! Where do you get your dubious information from?
Seriously, though, there have been a few studies of varying degrees of reliability that indicate that the new-technology business failure rate over five years is quite a bit smaller than the old "9 in 10 startups fail" wives' tale would have us believe. I wouldn't put significant weight on any particular piece of data, but it seems to be pretty consistently in that range whenever people who I would trust to know the numbers (VCs, angels, journalists covering the field, successful and famously unsuccessful entrepreneurs) talk about it.
And, among my own peers here in the valley who started during WFP07, about 1/7 of them are already rich (by some definition of rich). There were 21 groups, and I believe 3 have had exits, and I'm certain that Octopart, Weebly, Buxfer, Heysan, and Virtualmin have not come fully to fruition yet. Tsumobi might even surprise folks, as they're still slaving away in their secret underground lab in the Balkans (Josh may have actually said, "Boston", it was hard to hear at the Startup School reception due to the size of the crowd). YC certainly makes a notable improvement in the outcomes of their startups, but it's not magical, so I don't think it's a crazy idea to look at YC startups as at least somewhat representative of tech startups in general--where "tech startup" means, to me, folks who actually file the paperwork, build something, and get it into the hands of users...until you've done that, you're just another dork with a big idea (and those probably fail at a much higher rate than 9 in 10).
Anyway, we're only a year and a half into the experiment with the WFP07 group, and I expect the numbers will probably end up in the 1/3 to 1/2 range.
Adam has explained to me why their approach was different from Hecl, but I won't attempt to reproduce that explanation, since I don't actually know what Hecl or Tsumobi are all about. I have vague notions, and when I'm actually talking to Adam or Josh about it it all makes beautiful sense...but then they stop talking, the fog returns, and I have no idea what it is they're building.
What I'm saying is, they're really smart guys working in a field that I know almost nothing about, doing work that walks a razor-thin line between "research" and "product". Thus, one of their biggest problems in reaching a market, reaching investors, or reaching developers is making what they're working on into a concrete solution to a real-world problem that everyone (or at least their customers) can understand quickly. I think they'd be a bargain for anyone that hired them (either by investing in them or acquiring Tsumobi) because they are extremely smart kids with huge ideas, but I'm not sure how many people will see that based on what they're building.
And, while I'm pontificating, I don't think I'd be crazy to suggest that the best thing they could do would be to get their current code into the hands of some customers--even just a few. Because nothing guides you to providing value like having customers. And the more they pay (or the more ownership they have, if it's an Open Source project), the more value they demand...and that's a good thing when it comes to finding a need and filling it.
I guess it depends on what your goal is. My goal is making this startup work, but I'm in a dysfunctional market. I think I'd rather not know the real odds.
Riding the subway earlier today, I watched a guy get on and start preaching how "the lord saves" and "you must embrace Jesus." I wish math could create equivalent evangelism. I'd love to live in a world where a guy gets on the subway and starts preaching the Pythagorean theorem.
If you want a profound analysis, he wrote a (fiction) book recently on this subject, and put it out on the internet under a CC license. It's an enjoyable read; he pretty much distilled a few bits from the book to make that article.
I think it was about as in-depth as you are likely to get published in a (UK) national newspaper. One small but well-written step in the fight against the innumerate sensationalists that dominate the press and no doubt contribute to the populist over-reactions from the UK government.
Something doesn't seem right about Doctorow's example of attacks on children. He writes, "But the fact is that attacks by strangers are so rare as to be practically nonexistent. If your child is assaulted, the perpetrator is almost certainly a relative (most likely a parent)."
I'm sure that's true, but it doesn't answer the real question. For most parents, the question is: given my child's particular environment, what is the greatest threat to him?
Doctorow is saying: given that your child was attacked (and no additional information), the attacker is more likely than not to be a relative. That seems backwards.
He's stating that P(Was Relative | Child Attacked) is relatively high, especially compared to P(Was Total Stranger | Child Attacked). This means that, if you use Bayes' Theorem correctly in everyday life, it should shift your suspicion away from the random photographer.
Of course, since P(Child Attacked) is so very low it's still not a huge deal.
It's not really backwards. That sort of inversion is exactly part of Bayes' Theorem.
If you are related to X people, then the odds that one of them attacks your child are greater than the odds for anyone else on the planet.
Picking numbers from thin air:
Let's say you are related to 30 people, know 200, and encounter 10,000 random strangers, and there's a 1 in 100 chance of an attack. Well, the odds that a specific random stranger attacks your child, given the above assumptions, are less than 1/2,000,000. The odds that a specific person you know attacks your child would be less than 1/20,000, and the odds that a given relative attacks your child would be more than 1/6,000. Now who would you focus on? (What about a predator with a 50% chance of attacking someone? Well, he is still under 1/6,000, because there are so many other people for him to attack.)
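Roughly how those thin-air numbers can fall out (the split of who commits the attacks below is my own illustrative guess, not the parent's):

    # Illustrative only: per-person odds when a child faces a 1-in-100
    # chance of attack. The attack shares below are invented assumptions.
    p_attack = 1 / 100
    groups = {              # name: (head count, assumed share of attacks)
        "relative": (30, 0.50),
        "known":    (200, 0.45),
        "stranger": (10_000, 0.05),
    }
    for name, (count, share) in groups.items():
        print(name, p_attack * share / count)
    # relative ~1/6,000; known ~1/44,000; stranger ~1/20,000,000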
PS: Given my child's particular environment, you should focus on relatives and people you know.
It's not just the probability of an occurrence that leads us to behave in an unreasonable manner. The other important thing is the impact of that 'rare occurrence'. Most critics don't seem to take that into account at all...
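In expected-value terms (all figures invented purely to illustrate):

    # Expected cost = probability x impact; a rare catastrophe can match
    # or dominate a common nuisance. Figures are made up for illustration.
    rare   = 1e-6 * 10_000_000_000  # 1-in-a-million event costing $10B
    common = 1e-2 * 1_000_000       # 1-in-100 event costing $1M
    print(rare, common)             # both come to $10,000.0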
Many if not most citizens of the USA do not understand the basic organization and functions of their government. Most of them cannot name a single sitting Supreme Court justice, have never read the Constitution, and do not know how a bill becomes a law.