Assumption vs. Hypothesis – To the Death!

Something has been irritating me: The startup community uses the words assumption and hypothesis interchangeably.

We also rarely use hypothesis correctly, often referring to laughably vague statements as hypotheses, such as this gem I overheard:

Our hypothesis is that a $50k seed round will be enough to show traction.

Double face palm - Telling the difference between a hypothesis and an assumption shouldn't be this hard

I do the same damn thing, hence my irritation. We should stop doing that, because the distinction has an OVERSIZED effect on our next decision for our startup.

tl;dr: Assumptions should be challenged and clarified with research. Falsifiable hypotheses should be tested with an experiment.

Bonus: I’m writing a more complete guide to designing great experiments as an open source “Real Book”. You can get on the download list here.


The Dictionary

Let's ask the dictionary: what is an assumption anyway? While they are listed as synonyms by many dictionaries, they are really not the same word. Here’s a definition for assumption from Merriam-Webster (because I’m too damn cheap to pay for the OED):

a fact or statement (as a proposition, axiom, postulate, or notion) taken for granted – Merriam-Webster

Here’s hypothesis:

an assumption or concession made for the sake of argument – Merriam-Webster

Oops…that’s almost identical and even uses the word assumption. But not quite: it’s an assumption…for a specific purpose. Here’s a clearer definition:

a tentative assumption made in order to draw out and test its logical or empirical consequences – Merriam-Webster

Now that’s an interesting difference, and it’s important: depending on whether we have an assumption or a hypothesis, we should do two different things.

If we have an assumption, we accept the risk that the assumption is false and move on.

If we have a hypothesis, we attempt to falsify it.

Research Assumptions

If we look up a few more of assumption‘s numerous definitions, we’ll also get a sprinkling of the word’s religious roots. That’s appropriate, because at the heart of the word is the idea that we take something on faith.

For our startup, an assumption is usually something that we are not going to investigate. It’s something we will take on faith. We have many assumptions and they’re not all bad.

We might look at an analog to our startup idea (a probiotic search engine) and see that the companies that sell probiotics have a lot of internet traffic, and that it’s growing month over month. We could then assume there is a sufficient market size to justify our interest.

That assumption may be disastrously wrong. Perhaps those companies are buying traffic with no profit to show for it, but we are free to make that assumption and take the risk.

If we have an assumption, we can either accept the risk or convert it into a testable hypothesis.

Hint: By “testable” I mean falsifiable.

Convert Assumptions into Hypotheses

Assumption: The market is large enough to support this business.
Hypothesis: There are 20,000 search queries per month using the term ‘probiotic’ and this number will grow by 20% next month.

Assumption: Our product solves the problem.
Hypothesis: If a visitor shopping for probiotics comes to our landing page, they will enter a search query.

Assumption: We’ll be able to raise an angel round really easily.
Hypothesis: If we send a cold email to 10 angel investors on Angellist.co, we’ll be able to get 3 meetings within two weeks.

As you can see, all the assumptions are vague, optimistic, and untestable. The vaguer they are, the harder they are to disprove.

What makes a good hypothesis? The hypotheses above are relatively specific, and we can easily see how to design an experiment to gather the data that could disprove each one.
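To make “specific and disprovable” concrete, here is a minimal sketch in Python of a hypothesis expressed as a metric plus a fail condition. The class, field names, and numbers are illustrative (borrowed loosely from the first row above), not part of any particular tool:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """A falsifiable statement: a metric, a threshold, and a fail condition."""
    statement: str
    metric: str
    threshold: float  # minimum observed value that keeps the hypothesis alive

    def is_disproved(self, observed: float) -> bool:
        """Fail condition: an observation below the threshold falsifies the hypothesis."""
        return observed < self.threshold


# Illustrative numbers from the table: 20,000 monthly queries, expected to grow 20%.
h = Hypothesis(
    statement="Searches for 'probiotic' will grow 20% next month",
    metric="monthly search queries containing 'probiotic'",
    threshold=20_000 * 1.20,  # 24,000 queries expected next month
)

print(h.is_disproved(observed=21_500))  # True: falsified, rethink the market assumption
print(h.is_disproved(observed=25_000))  # False: not disproved, keep testing at higher fidelity
```

Contrast that with “the market is large enough to support this business”: there is no observation we could feed it that would ever come back as clearly disproved.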

Hidden Assumptions

The most dangerous kind of assumption is the one we don’t know we have. In Rumsfeldian, that’s an “Unknown unknown.”

Entrepreneurial Optimism vs. Reality - What's the difference between assumptions and hypotheses?

 

To reveal hidden assumptions, there are a few tried and true generative research methods:

  • Use a framework such as the Business Model Canvas to list your assumptions
  • Have a peer challenge you with questions about your business model [hint: write them down]
  • Watch your customers try to solve their own problems
  • Talk to your customers!
When just starting, our biggest challenge is not to build an MVP, but to identify our own assumptions.

Ready to Test? What is a good hypothesis?

Before we get excited and start building anything, and before we start talking about our hypothesis, let’s make sure it’s a real, falsifiable hypothesis and not just a vague assumption.

Look at the hypothesis and go through this checklist:

Symptom: Are there vague words like “some people” or “customer”?
Fix: Be specific. Create a well-defined customer persona.

Symptom: Is it falsifiable? What evidence would convince a reasonable person that the hypothesis is wrong?
Fix: Create a measurable hypothesis. Eliminate hedging words like “maybe,” “better,” and “some,” and convert it to an IF ________ THEN ________ statement.

Symptom: Is it actually risky?
Fix: If it’s not truly risky, it’s not relevant and we don’t need to test it right now. (It may get more risky later and resurface.)

Symptom: Has a second set of eyes looked at it?
Fix: We all have blind spots. Check your work with another entrepreneur and ask them to tighten up the hypothesis.
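As a rough companion to the first two rows of this checklist, here is a small sketch (the word list and example sentence are hypothetical, not exhaustive) that flags hedging and vague words in a draft hypothesis so it can be rewritten as a specific IF ________ THEN ________ statement:

```python
# Illustrative list of words that signal a vague, hard-to-falsify hypothesis.
HEDGE_WORDS = ["maybe", "better", "some people", "some", "customer", "easily"]

def vague_terms(draft: str) -> list:
    """Return the hedge words found in a draft hypothesis."""
    text = draft.lower()
    return [word for word in HEDGE_WORDS if word in text]

draft = "Maybe some people will find our product better."
print(vague_terms(draft))  # ['maybe', 'better', 'some people', 'some']
# Rewrite as: "IF we show the landing page to 100 visitors shopping for probiotics,
#              THEN at least 10 of them will enter a search query."
```

The IF/THEN rewrite in the comment is the part that matters; the word scan is just a quick smell test.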

What have I missed on this checklist? Got a tip? Add it in the comments below!

Bonus: I’m writing a more complete guide to designing great experiments as an open source “Real Book”. You can get on the download list here.


So…what should I post next? Tweet to tell me what to write:

Show me how to test product market fit!

or

How can I do lean startup in my friggin' huge company?

9 comments

  1. IMO, it’s less about using the wrong word and more about using inflated language. There are any number of reasons, all probably involving ramping up validity and importance. But I just don’t have time right now to expand upon this theory.

    • Tristan says:

      Yes, totally agreed. It’s about sounding scientific when we’re just guessing.

  2. It’s a good concept in general, getting across the silly side of how people assume things without testing them. I think you’re missing a few things, though, that could make this article make a very important point. You need to make the case you’re using as an example more specific, with more context, to better illustrate the issue, so it’s not so vague in its variables (known and unknown). I get the gist of what you’re saying, but it could be more specific. When you move from assumption to hypothesis, your hypothesis also needs to be more specific, as you’re not illustrating the variables being tested in that hypothesis or what an experiment looks like to validate or falsify it properly.

    I.e., our product (what is the product?) is a search engine specifically geared towards one vertical? OK, show me a graphic of that so I can see what the product looks like. What am I expecting them to do to know this product qualifies as something they find useful? Is 20k queries a success metric for a successful search engine? I’m not sure I understand how that metric was obtained. That benchmark is important to identify, to know if you are aligned to the market properly, so you can weed out historical defects in data, noise, technical issues, anomalies, etc. Knowing how to set a benchmark for traction would be important.

    I.e., a cold email to customer list X results in Y. OK, that is a bit better out of the three you listed, but even a cold email has variables that must be tested to examine whether it’s the subject, body, contact used, from address, length, time sent, personalized or robot-sounding, etc. They might send a cold email and get no response, which could validate that it’s not as easy as they thought, but that wouldn’t tell them much in terms of learning. You did set up an expected success metric in your hypothesis, but it’s a bit arbitrary in determining what counts as easy. So it would be good to identify what people expect easy to look like, as it may vary. Two weeks to get 3 meetings might give the wrong impression. The important part is to have a success metric to evaluate against that is specific and not vague, so I get some of your point, but I just thought more specificity would bring it out more and illustrate what is important and why. I.e., no success metric, no specifics = poor data = poor decision making.

    What I wanted to see was the context established with more specificity around your product example, which was being approached with generalities versus specifics. I couldn’t visualize or relate to the product you mentioned, so I had trouble relating to the examples. The examples did show moving from general to specific, but those hypotheses weren’t specific enough for me, and there was no experiment to show what you meant: an assumption is something you think is true that needs to be transformed into a hypothesis, and then the hypothesis needs to transform into an actual experiment. A hypothesis is something you think is true, and so is an assumption. People use them interchangeably because of how they relate to the definition of the word.

    The important point is that if you assume something to be true, then you must try to prove it false by evaluating the variables in a controlled experiment, to validate that the result you expect can be repeated when you manipulate said variables the same way over and over again. I.e., pour 1 lb of salt on the road and plants within 2 ft of the road will die. Transforming that into the business realm of success metrics, which can be some behavior you expect when you manipulate a certain variable, is not easy, because people are missing the data points or variables they need to test. They are also not understanding the framework. So one challenge for your points is how to use the framework to approach an assumption so it can improve one’s decision making. The other is what to test and why, and where it fits into some sort of ecosystem that drives other variables that will contribute to their business model working in all parts together for a profit, with the traction they need to know this is something worth pursuing.

    The graphics you chose also confused me as I was unsure what you were trying to get across with the bubbles. I think you were trying to convey some sort of perspective but the titles confused me with the compare and contrast example used. My current way of thinking (what you are trying to establish I think in a graphic) and the way I should think (the way that will help me avoid disaster). I think a better graphic might convey the symbolism you meant there.

    I took the time to critique this because I think this is a great article to start with, but it also touches the root of the issue with how people approach business: there are key points missing that, if clarified and understood at a deeper level, could prevent people from poor decision making and bad paradigms when approaching business.

    thanks for the effort to write it so I had something to critique 🙂

    • Tristan says:

      Awesome Steve! There’s some great stuff in that response.

      Mostly I agree. One thing I’ll clarify, one I’ll object to:

      re the illustration: Mostly just meant to be funny. But definitely could have been funnier and more to the point. I find drawing a clear circle challenging and had to photoshop my handwriting to be legible. So…agreed…could be way better.

      re an illustration of the probiotic product: I have no idea how to draw that. I draw like a two year old sniffing glue.

      Examples of experiments: I also agree with you here. I prefer in this case to create a hypothesis, then figure out what metric would help me ascertain validity, then figure out how to gather that data. So I generally keep those as separate steps. (Just how my brain works, I’ve seen others tackle experiment design in one fell swoop.)

      Here, I’m just talking about step one. That could be clearer.

      Success metrics: Here I disagree. I hate success metrics and think they are largely worthless. I prefer to look at Fail Conditions. If you don’t hit a success metric, it invariably gets interpreted by startups as “inconclusive.” Then they build more and hope for a larger sample size later.

      In science, we never prove a theory; we just fail to disprove it. Hence the Theory of Gravity rather than the Fact of Gravity.

      So I like to set a fail condition which would tell me that, without question, it is not worth investing any more effort in proving a hypothesis…even if the sample size is small! (As it often is.)

      We can never truly validate hypotheses; we can only invalidate them. What we often call validation is really just permission to continue testing with a larger sample size and at a higher fidelity.

      …and that is absolutely worth another post.

      • I hear you. Let me try to bring you into my reading experience of your article. I am looking for a point to be made, then supporting examples of that point, then a conclusion. I don’t actually disagree with the failure condition. To me that is a success metric, because it meets my criteria to spend more time on something. It’s semantics, so I can see how we can interpret them to be different, though. Here are some challenging things for people trying to apply lean:

        1. The sequencing of the thinking to problem solve.

        2. What is the thing they are struggling with? What is the hypothesis that is causing that struggle? What are the components needed to solve my problem to protect me from wasting my time? So sure, you could set a fail condition to say: if X happens, then I’m done; it’s a fail. Let’s take SpaceX, for example, since it’s visible, scientific, and current. They are trying to disprove that you can launch a rocket and land it back on a launch pad to recover it. So they go through all the variables that will eventually make that false, and they will have landed it. So the fail condition is: 3 rockets die and can’t land, so I give up, as I proved this is not possible? So I look at that situation and say: there is no way this is possible. I will set out to disprove that rockets can launch and land again and be recovered. If I cannot do it in 5 launches, then the money is done and we are calling it quits. That is the thinking, instead of success metrics that they learn from and keep adjusting until they get the validation they really need? I see your point about hopefortunity versus real opportunity. I see your point about self-bias. I see your point about going down a road for too long thinking little successes will eventually turn big, which is what causes people to go into debt on projects that should have been killed long ago. But I think that could just be poor success metrics and unrealistic expectations, and using experiments to see what you WANT to see versus the REALITY of what is there that tells you that you should stop.

        I think your point is: use experiments and lean to protect your financial, emotional, and spiritual bank account from being destroyed by your own bias. I think you should use both, though: a success metric (specific and real) and a fail condition, so there is an acceptable range. 19% vs. 20% is arbitrary for a metric to be valuable; one needs more metrics there to tell the whole story. I think maybe you might be saying: don’t just rely on success metrics, but make sure you set a fail condition as well, so you set a limit. When setting that limit, make sure it’s informed, or that is just more arbitrariness in trying to define failure as well. SpaceX hit the platform but destroyed the rocket twice so far. It did hit the pad, though. If it went into the ocean completely three times, maybe that would be the time to stop and re-evaluate completely, because you made no real progress after $200 million spent.

        To flip that to a business example: maybe getting 10 customers means nothing if I made the success metric 11 or 15. Maybe one should look at the type of customer and revenue and other metrics to see if they have something worth pursuing or not. I think most success metrics are just made up, and you make a valid point, because most people making them have no idea why it’s the number they say. It’s not informed. So they would unfortunately be just as bad at making an informed fail condition too. Maybe if they used both and said: I need to see 10 customers of type X pay $ and sign up and stay for X time, and then made a fail condition with it that says: if I see 10 customers and X of them fall off and leave in X time, then this is not worth bothering with, because traction isn’t really there or we need to fix the product. It helps isolate and give a range of success and failure, and keeps one on solid ground, as they can now see the fail side is happening, so their success metric isn’t so successful after all. It could help them readjust to a more realistic success metric and self-calibrate for bias.

        3. What is the scientific method, and how do we apply that framework to business thinking instead of science thinking, where variables are seen differently? I.e., apply 2 lbs of salt to the road for 1 day (variable A) to see if plant B (type specified) shrinks or dies (dependent variable B). The experiment would measure salt’s effect on plant death. A falsifiable hypothesis might be simple, like: road salt kills plants within 2 hrs. If you can’t prove that, then it’s not true. The concept is to evaluate variables that create some type of result we seek or do not seek. Now, translating that into a business experiment is not as easy for people to think about, because they don’t really understand the fundamentals of the business science. Salt is now suddenly some other odd variable they are not even sure is a driver or a variable in a larger equation. So they need to understand how to even choose variables and then set up an equation for the learning. Then they have to move to the actual experiment, which will look different.

        I think a challenge for lean thinking is giving people a general sense of business model dynamics and how one area impacts another. Just getting a sense of the scientific method itself can accelerate one’s adaptation of it to the business realm.

        I could run dumb experiments all day long just to help me understand the method before trying to apply it to an area that has risks and money associated with it.

        I.e.: if I style my hair this way (A), girls will respond (B) and I will get 2 dates in 1 week (success metric); fail condition: after 1 week, no dates at all, or weird looks from girls, or 5 rejections. Time to change the hair style on the fail condition. This is still a crappy experiment: there are so many other variables that could have contributed that my hair style might not have been part of the success at all. This is similar to your penguin concept about them flying. But let’s get really clear here on what I want to learn. The simple thing here is for me not to keep spending money on different hair styles, as I might have gotten 2 dates and now think this is working. I might have bias here and be wasting money on hair stylists for weeks on end, thinking I see some success. How could I falsify this notion quickly and cheaply? Thinking this way makes me step away from wanting it to work or trying to see a glimmer of hope. The thinking itself is what you’re trying to get at, I think. The approach to the experiment keeps you distant from results. A success metric, you say, makes me think 2 dates validated the experiment, so it works. A fail condition makes me see reality: I have to be really serious about how controlled this experiment is and what is really working here to get the dates. It makes me more objective. I think you need both so you can learn. 2 dates really means nothing in terms of success or validation; that’s your point, and I agree. But it gets me started. I can now go back and see from those 2 dates why they actually decided to go out, and if both mentioned that the hair style just drew them in, then I could say: OK, interesting, but not sufficient. Let’s try that on 5-10 girls, because I want to prove this wrong. So the success metrics give you some parameters for your fail conditions, I think. It helps confirm, require more exploration, or totally knock down your original idea.

        I hear you about invalidating things to keep refining what is actually causing the results. Totally agree. I think success metrics, the way I think about them, help create a range of acceptability versus just accepting success criteria as validated.

        What do you think?

        • Tristan says:

          I think we are vehemently agreeing on all but the framing.

          Success Metrics vs. Failure Conditions is not a semantic argument, because they frame the concept differently and cause very different effects for most, if not all, humans.

          A similar reframing is asking someone to choose the most important of ten features to build for their MVP. That’s an agonizing decision taking place over hours, which usually winds up with the entrepreneur unable to choose just one feature and building several, if not all, of the proposed features.

          Instead, we can ask the entrepreneur to select the LEAST important of the ten features and most likely they will be able to decide within a few seconds. Do this 9 more times and you wind up with the most important feature.

          Does this make logical sense? No.

          It is the same question, reframed. But one of them is very easy to answer and results in a quick decision; the other is agonizing and takes forever, often resulting in no decision.

          It’s kind of silly, but the framing makes a huge impact and there are dozens of cognitive biases such as the endowment effect that show this time and time again. I’ve been doing the above example for years and it’s amazing how quickly a group of people can agree on the least important thing rather than the most important thing.

          …of course…why would they argue about the least important thing! It’s the least important thing!

            • I looked back at what I wrote and didn’t articulate the semantics issue properly, so I see where you got a different point from what I said.

            Let’s start here to frame this properly.
            Success metric vs. fail condition: I don’t see it as an either/or situation. I see them complementing each other, as each forces a different way of thinking about something. So you are very correct that they are VERY different. I chose a poor word. What I think I should have said was: I need them both to create a parameter to examine a situation. Two data points, if you will, to create a range. I hear you about the time churn to make a decision and the idea of just getting people moving to fail quickly instead of gyrating around in their heads. I agree. Asking what the top 10 features are versus which are least important will reframe the situation to get things moving, but that is not an example that illustrates fail condition thinking in a way that lets me grasp what you want to say about fail conditions and success metrics. I get the idea about disproving something versus proving something to avoid bias issues. I.e., I will disprove that 50 people will buy my lawn mowing service for $400? I still feel what you’re saying is not drawing the connecting points clearly for me, though, with solid examples. It seems a bit scattered with the examples you’re using. We went from penguins flying and disproving they can fly to the least important of 10 features in software. Help me with some better concrete examples, with a context that stays the same, so I can track with you better.

            Framing is very important. Are you trying to say the top-10-features question is a success metric question, asking an entrepreneur to name the top 10 features and thus requiring tests of all of those assumptions about each feature’s importance to a user? And that this is laborious and time consuming? So instead of that, you ask the entrepreneur to just list the least important feature, one at a time, 9 times? But where are they getting those assumptions that the features aren’t important, is my question? That is bias as well, because they are assuming they know what is non-essential or essential. What phase are you referring to with this question? Problem/solution validation, to make sure they understand the problem and the potential solutions to it? Or product/market fit, where they already have enough data from testing and are now concretizing their MVP for the product/market fit stage? Are they validating the problem, or the solution, or the solution to the market to see if they get traction? Sounds like you chose a specific issue here in addressing my response within the lean framework, which is WHAT TO BUILD.

            Sounds like you’re saying: fail fast and cheap instead of spinning your wheels trying to figure it out in your head, and seeing success metrics as traction when it might not be traction but bias? I think what causes this issue of time churn you mention, or bias, is not understanding the customer well enough and not knowing what kind of stimulus or solution to present to them, because they lack a true understanding of the problem, which makes ranking what to build become this laborious task?

            You might be saying: I have $500 to spend to make an MVP, because I am at the product/market fit stage or the problem/solution fit stage and want to validate my assumptions here about the solutions I’ve tested thus far. Now, to take it further, I need to know what I should make at higher fidelity to really prove this thing out and keep going or stop. The potential loss of $500 makes that choice of the top 10 features (if it’s software) really tough, because the risk of loss makes people more cautious. So you’re saying: build some low fidelity MVP by selecting the things you think people will not find important at all, and get feedback with a test to validate that, instead of spending 3 weeks trying to become Karnak at guessing what people really want? That is the build/measure/learn loop, I guess. What you stated as an approach is basically http://www.livescience.com/21569-deduction-vs-induction.html

            Is a list of the 10 least important features a fail condition? I’m still not sure I can see your example as a fail condition example. In one article you’re talking about making success criteria, and in the next you say to make a fail condition instead?

            Help me out. Give me a solid example so I can see your point better.

            My point was to use both data points. Make your hypothesis, use some success criteria to get a sense of traction, set a fail condition to help you calibrate to what is really successful or a failure to avoid bias of your success or perceived success until you really understand what success looks like in your situation.

            Penguins flying is hard for me to relate to as an example. The top 10 feature list is hard as well, because I’m not sure of the context or stage, or what you are considering the fail condition in that scenario. Can you identify a good example of a fail condition for the top 10 software feature list?

            thanks
            steven

          • Tristan says:

            My own blog won’t let me leave a further nested reply so I’ll do so here.

            Forget about the penguins or features. The only point I am making is about framing. Framing is important and can radically impact how we interpret anything.

            Success vs. Fail framings are certainly very different. I agree that for very sophisticated experimenters, having a success condition might be appropriate. I don’t like one-size-fits-all frameworks in any case.

            However for most people that I interact with, I would never recommend setting a success condition for any test, let alone both a success and fail condition.

            If we set both, then we’re also unfortunately defining a middle ground of “not enough data” or “indeterminate.”

            For a piece of research or an experiment to be good, we need to come away from it with a clear next step and take action.

            By setting a fail condition on an experiment, we have two possibilities:

            1) The test invalidates our hypothesis and we should pivot.
            2) We do NOT invalidate our hypothesis and we have permission to go to a higher fidelity test.

            In the case of setting both success and failure conditions you have three situations:
            1) The test invalidates our hypothesis and we should pivot.
            2) We succeed, in which case we go on…build the thing or whatever.
            3) The result is indeterminate.

            Now this IS a semantic distinction, because we could reasonably say that 2 + 3 really just means: we have permission to run a higher fidelity test.

            However, I simply find that in practice, people take a result of “success” to mean “there is no more risk in this hypothesis.”

            The idea of a no-risk hypothesis is not realistic to me. There are hypotheses with acceptable levels of risk, which we might take as an assumption and move on, reasonably confident.

            Then there are hypotheses with unacceptable levels of risk, which we should test further.

            I think framing a hypothesis as “success” or “validated” often tricks us into thinking we shouldn’t investigate further.

            That may be the case, but we should consciously decide that the risk is low enough not to investigate further, not simply stop because we framed the hypothesis as “validated.”

  3. Pingback: Success Metric vs. Fail Condition – To the Pain! by @TriKro
