I'm not sure I agree about the qualitative stuff though. I said this in another comment, but if you had two pictures of two different apples, you wouldn't need to classify either picture as anything to blend them together. You could just...smash them together. But it's not literally just mixing up the pixels; it's using a whole bunch of other pictures to figure out how you can sensibly interpolate between the two.
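To make the distinction concrete, here's what "literally just mixing up the pixels" would look like — a naive per-pixel average of two toy arrays standing in for real images. A generative model's blend is doing something much richer than this, precisely because it leans on patterns learned from many other pictures:

```python
import numpy as np

# Two toy 2x2 grayscale "images" (values 0-255), standing in for apple photos
apple_a = np.array([[200, 180], [190, 170]], dtype=float)
apple_b = np.array([[100, 120], [110, 130]], dtype=float)

# Literal pixel mixing: a per-pixel weighted average.
# No knowledge from "a whole bunch of other pictures" is involved here --
# this is the baseline that learned interpolation improves on.
t = 0.5
naive_blend = (1 - t) * apple_a + t * apple_b
print(naive_blend)  # [[150. 150.] [150. 150.]]
```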
That's basically what humans do. If we're asked to summarize ten support tickets, we compare the tickets to a bunch of other language we're familiar with, figure out what in those tickets is unique, and then kinda jumble it all up, highlighting those unique things and bridging the language in the tickets with patterns from more general language so that it all makes sense. We can do that without ever using any sort of explicit or implicit qualitative encodings. We just...kinda average the language.
Even math is qualitative though... we have rational, irrational, and imaginary numbers, just to name a few. Yes, it might make sense to combine these sometimes, but deciding whether to combine them or not means we have some kind of methodology or decision system that helps us decide whether things are similar enough to be aggregated or whether they should be grouped into a different category.
The specific definition of "average" might vary (arithmetic mean, geometric mean, etc.), but the concept of averaging is a well-defined encoding.
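For example, the arithmetic and geometric means give different "averages" of the same data, yet each is a precise, well-defined encoding:

```python
import math

values = [2, 8]
arithmetic = sum(values) / len(values)              # (2 + 8) / 2 = 5.0
geometric = math.prod(values) ** (1 / len(values))  # sqrt(2 * 8) = 4.0
print(arithmetic, geometric)  # 5.0 4.0
```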
Yes, it is possible to combine things without having a methodology, but whatever is doing the combining is still biased by accessibility and proximity--and a system without constraints is likely to approximate cosmic latte. To be useful, any AI system needs to group things into matrices rather than a single float value. Each matrix index is a bin.
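A minimal sketch of that contrast, using made-up hue values: collapsing everything to one float is the "cosmic latte" problem (everything averages to beige), while binning into a vector keeps the shape of the data:

```python
import numpy as np

# Hypothetical hue values in [0, 1]: two clusters, one dark and one light
colors = np.array([0.1, 0.15, 0.9, 0.85])

# Single float: the bimodal structure vanishes into a middle-gray average
single_value = colors.mean()  # ~0.5

# Vector of bins: each index is a bin, and the two clusters stay visible
bins, _ = np.histogram(colors, bins=4, range=(0.0, 1.0))
print(single_value, bins)  # ~0.5 [2 0 0 2]
```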
Sure, the various models are all math, and all that. And there are different ways to do that math, whether the math is some LLM black-box magic or just math math. But a bunch of crazy matrix math with tons of parameters is very different than something that tries to categorize a thing by color or some other human-understandable definition. If you add a billion bins, is that really binning?
Yes, it is still binning, but it might not be an interesting type of binning--it might just be an implementation detail of a higher-level system. Similar to how UUIDs are not business keys, but they *might* be in a 1:1 relationship with something interesting. If a matrix is just storage, then how much noise gets stored depends on its alignment with the sample rate and SNR. I say implementation detail because there might be redundancies in storage, similar to how many synapses in humans duplicate information... So there might be 10 bins that equal 1 category of things, or 100,000 bins that point to the same thing. Do these additional bins add value? It depends. A camera might have 100 megapixels, but if the lens isn't focused or uniform across the array then you'll get a lot of blurry or redundant information. But all of this is not really my main point.
You could have 1,000 surveys with many different quantitative prompts to try to understand why people don't like a bar, or how you can get people to like it more, but you can't send all of these surveys. Getting feedback has a cost, so how do you know the right survey to send, the right questions to ask? If you don't do a survey with qualitative prompts first, then you are biased towards your ideas about what a "bar survey" should look like, or you pre-suppose all the things wrong with the business. Maybe it is useful to validate against an existing hypothesis, but maybe it's not.
I'm not disagreeing with the conclusions of your article but I'm curious how you came to those conclusions if you don't believe that quantitative encoding (counting, aggregation, modeling) builds off of qualitative selection.
I think that I would agree that quant analysis (either surveys or analysis of behavioral stuff) is often best built on qual inputs (surveys, feedback, some instincts like, "this bar smells weird to me, personally"). That's the usual way this seems to work - develop a question with qual; analyze with quant; sometimes, confirm and get the "why" with qual.
My question is more: do we actually need the quant analysis? That rough arc has become so ingrained that I think we assume quant is necessary, but I'm not sure we don't do it just because of the limitations of qual: we can't do it at scale, can't do math on it, it's mostly anecdotal, etc. But if you could capture 10-100x more qual feedback and aggregate it in reliable ways, it seems like you could make a lot of decisions without the quant part.
Ah, nice, good catch. Words are hard.