Algorithms to Live By: The Computer Science of Human Decisions, by Brian Christian and Tom Griffiths

Page: 6 These hard-won precepts are at odds with our intuitions

Page: 6 They say: Don’t always consider all your options. Don’t necessarily go for the outcome that seems best every time. Make a mess on occasion. Travel light. Let things wait. Trust your instincts and don’t think too long. Relax. Toss a coin. Forgive, but don’t forget. To thine own self be true.

Page: 10 In any optimal stopping problem, the crucial dilemma is not which option to pick, but how many options to even consider.

Page: 14 the exact place to draw the line between looking and leaping settles to 37% of the pool,

Page: 14 Even when we act optimally in the secretary problem, we will still fail most of the time—

Page: 15 Thus the bigger the applicant pool gets, the more valuable knowing the optimal algorithm becomes.
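
A quick simulation (not from the book; the pool size of 100 and trial count are arbitrary choices) shows the Look-Then-Leap Rule in action: observe the first 37% of applicants without committing, then take the first one better than everything seen so far. The success rate lands near 1/e ≈ 0.37, and indeed we still fail most of the time.

```python
import random

def secretary_trial(n, rng):
    """One run of the secretary problem with the 37% look-then-leap rule."""
    candidates = list(range(n))          # 0 = worst, n - 1 = best
    rng.shuffle(candidates)
    cutoff = int(n * 0.37)               # look phase: observe, never accept
    best_seen = max(candidates[:cutoff], default=-1)
    for value in candidates[cutoff:]:    # leap phase: take the first improvement
        if value > best_seen:
            return value == n - 1        # did we land the single best?
    return candidates[-1] == n - 1       # forced to take the last candidate

rng = random.Random(0)
trials = 20_000
wins = sum(secretary_trial(100, rng) for _ in range(trials))
print(wins / trials)   # hovers near 1/e ≈ 0.368
```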

Page: 20 Any yardstick that provides full information on where an applicant stands relative to the population at large will change the solution from the Look-Then-Leap Rule to the Threshold Rule and will dramatically boost your chances of finding the single best applicant in the group.

Page: 22 The critical thing to note in this problem is that our threshold depends only on the cost of search.

Page: 27 Surprisingly, not giving up—ever—also makes an appearance in the optimal stopping literature.

Page: 28 The math shows that you should always keep playing. But if you follow this strategy, you will eventually lose everything. Some problems are better avoided than solved.

Page: 29 because the flow of time turns all decision-making into optimal stopping.

Page: 31 do we try new things or stick with our favorite ones?

Page: 32 the explore/exploit tradeoff.

Page: 32 Simply put, exploration is gathering information, and exploitation is using the information you have to get a known good result.

Page: 34 When balancing favorite experiences and new ones, nothing matters as much as the interval over which we plan to enjoy them.

Page: 34 A sobering property of trying new things is that the value of exploration, of finding a new favorite, can only go down over time, as the remaining opportunities to savor it dwindle.

Page: 35 So explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in. The interval makes the strategy.

Page: 35 From a studio’s perspective, a sequel is a movie with a guaranteed fan base: a cash cow, a sure thing, an exploit. And an overload of sure things signals a short-termist approach, as with Stucchio on his way out of town. The sequels are more likely than brand-new movies to be hits this year, but where will the beloved franchises of the future come from? Such a sequel deluge is not only lamentable (certainly critics think so); it’s also somewhat poignant. By entering an almost purely exploit-focused phase, the film industry seems to be signaling a belief that it is near the end of its interval.

Page: 36 they’re pulling the arms of the best machines they’ve got before the casino turns them out.

Page: 36 choose an arm at random, and keep pulling it as long as it keeps paying off.

Page: 36 win-stay turns out to be an element of the optimal strategy for balancing exploration and exploitation under a wide range of conditions.

Page: 36 Changing arms each time one fails is a pretty rash move.
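
A toy sketch of win-stay, lose-shift on a two-armed Bernoulli bandit (the 60%/10% payout rates are hypothetical, not from the book). Even this crude rule, which the book notes over-reacts by abandoning an arm after a single failure, comfortably beats pulling arms at random:

```python
import random

def win_stay_lose_shift(probs, pulls, rng):
    """Play a Bernoulli bandit: stay on an arm after a win, switch after a loss."""
    arm = rng.randrange(len(probs))
    wins = 0
    for _ in range(pulls):
        if rng.random() < probs[arm]:
            wins += 1                        # win: stay on this arm
        else:
            arm = rng.randrange(len(probs))  # lose: shift to a random arm
    return wins

rng = random.Random(1)
# Hypothetical two-armed machine: one arm pays 60% of the time, the other 10%.
total = win_stay_lose_shift([0.6, 0.1], 10_000, rng)
print(total / 10_000)   # beats the ~0.35 expected from picking arms at random
```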

Page: 38 He conceived the goal as maximizing payoffs not for a fixed interval of time, but for a future that is endless yet discounted.

Page: 38 value assigned to payoffs decreases geometrically:

Page: 38 that is, each restaurant visit you make is worth a constant fraction of the last one. If, let’s say, you believe there is a 1% chance you’ll get hit by a bus on any given day, then you should value tomorrow’s dinner at 99% of the value of tonight’s, if only because you might never get to eat it.

Page: 41 The Gittins index, then, provides a formal, rigorous justification for preferring the unknown, provided we have some opportunity to exploit the results of what we learn from exploring.

Page: 41 Exploration in itself has value, since trying new things increases our chances of finding the best. So taking the future into account, rather than focusing just on the present, drives us toward novelty.

Page: 42 “To try and fail is at least to learn; to fail to try is to suffer the inestimable loss of what might have been.”

Page: 43 Herbert Robbins

Page: 43 First, assuming you’re not omniscient, your total amount of regret will probably never stop increasing, even if you pick the best possible strategy—because even the best strategy isn’t perfect every time. Second, regret will increase at a slower rate if you pick the best strategy than if you pick others; what’s more, with a good strategy regret’s rate of growth will go down over time, as you learn more about the problem and are able to make better choices. Third, and most specifically, the minimum possible regret—again assuming non-omniscience—is regret that increases at a logarithmic rate with every pull of the handle.

Page: 44 guarantee of minimal regret. Of the ones they’ve discovered, the most popular are known as Upper Confidence Bound algorithms.

Page: 44 Upper Confidence Bound algorithm says, quite simply, to pick the option for which the top of the confidence interval is highest.

Page: 45 Upper Confidence Bound algorithms implement a principle that has been dubbed “optimism in the face of uncertainty.”

Page: 45 In the long run, optimism is the best prevention for regret.
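
A minimal sketch of the UCB1 variant of an Upper Confidence Bound algorithm (the three payout rates are invented for illustration): each pull goes to the arm whose average reward plus confidence bonus is highest, so uncertain arms get the benefit of the doubt until the data rules them out.

```python
import math
import random

def ucb1(probs, pulls, rng):
    """UCB1: always pull the arm with the highest upper confidence bound."""
    n = len(probs)
    counts = [0] * n
    rewards = [0.0] * n
    total_reward = 0.0
    for t in range(1, pulls + 1):
        if t <= n:
            arm = t - 1                      # play each arm once to start
        else:
            # average reward + optimism bonus that shrinks with experience
            arm = max(range(n), key=lambda a:
                      rewards[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        r = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        rewards[arm] += r
        total_reward += r
    return total_reward, counts

rng = random.Random(2)
reward, counts = ucb1([0.2, 0.5, 0.8], 5_000, rng)
print(reward / 5_000, counts)   # most pulls concentrate on the 0.8 arm
```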

Page: 52 In general, it seems that people tend to over-explore—to favor the new disproportionately over the best.

Page: 54 To live in a restless world requires a certain restlessness in oneself. So long as things continue to change, you must never fully cease exploring.

Page: 62 This is the first and most fundamental insight of sorting theory. Scale hurts.

Page: 64 “Big-O of one,” written O(1), also known as “constant time.”

Page: 64 “Big-O of n,” written O(n), also known as “linear time”—with twice the guests, you’ll wait twice as long

Page: 64 What’s more, the existence of any linear-time factors will, in Big-O notation, swamp all constant-time factors.

Page: 64 “Big-O of n-squared,” written O(n²) and also known as “quadratic time.”

Page: 65 Constant time, written O(1); linear time, written O(n); and quadratic time, written O(n²).

Page: 65 There’s “exponential time,” O(2ⁿ), where each additional guest doubles your work.

Page: 65 “factorial time,” O(n!),

Page: 68 “Mergesort is as important in the history of sorting as sorting in the history of computing.”

Page: 68 O(n log n), known as “linearithmic”
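
A standard mergesort, the linearithmic algorithm the highlight refers to: split the list in half, sort each half recursively, and merge the two sorted halves in a single pass.

```python
def mergesort(xs):
    """Classic divide-and-conquer sort: O(n log n) comparisons."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = mergesort(xs[:mid]), mergesort(xs[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge two sorted halves
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]      # append whichever half remains

print(mergesort([5, 2, 9, 1, 5, 6]))   # [1, 2, 5, 5, 6, 9]
```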

Page: 72 Err on the side of messiness. Sorting something that you will never search is a complete waste; searching something you never sorted is merely inefficient.

Page: 73 but your email inbox almost certainly is—and it’s another domain where searching beats sorting handily.

Page: 81 dominance hierarchies are ultimately information hierarchies.

Page: 82 Being able to assign a simple numerical measure of performance results in a constant-time algorithm for status.

Page: 82 This move from “ordinal” numbers (which only express rank) to “cardinal” ones (which directly assign a measure to something’s caliber)

Page: 83 Much as we bemoan the daily rat race, the fact that it’s a race rather than a fight is a key part of what sets us apart from the monkeys, the chickens—and, for that matter, the rats.

Page: 84 In the practical use of our intellect, forgetting is as important a function as remembering. —WILLIAM JAMES

Page: 88 Depend upon it there comes a time when for every addition of knowledge you forget something that you knew before. It is of the highest importance, therefore, not to have useless facts elbowing out the useful ones. —SHERLOCK HOLMES

Page: 90 But despite an abundance of innovative caching schemes, some of which can beat LRU under the right conditions, LRU itself—and minor tweaks thereof—is the overwhelming favorite of computer scientists,

Page: 90 The nearest thing to clairvoyance is to assume that history repeats itself—backward.

Page: 98 Recognizing the Noguchi Filing System as an instance of the LRU principle in action tells us that it is not merely efficient. It’s actually optimal.
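
A minimal LRU cache sketch showing the principle behind the Noguchi Filing System: every access moves an item to the "front," and when space runs out, the item untouched the longest is evicted. (Capacity of 2 is just for demonstration.)

```python
from collections import OrderedDict

class LRUCache:
    """Evict the Least Recently Used item when the cache is full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)           # touching an item makes it "recent"
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)    # drop the least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")                # "a" is now the most recently used
cache.put("c", 3)             # evicts "b", the least recently used
print(list(cache.data))       # ['a', 'c']
```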

Page: 100 The key to a good human memory then becomes the same as the key to a good computer cache: predicting which items are most likely to be wanted in the future.

Page: 100 If the pattern by which things fade from our minds is the very pattern by which things fade from use around us, then there may be a very good explanation indeed for the Ebbinghaus forgetting curve—namely, that it’s a perfect tuning of the brain to the world, making available precisely the things most likely to be needed.

Page: 101 However, these criticisms fail to appreciate the task before human memory, which is to try to manage a huge stockpile of memories.

Page: 101 In any system responsible for managing a vast data base there must be failures of retrieval. It is just too expensive to maintain access to an unbounded number of items.”

Page: 102 The need for a computer memory hierarchy, in the form of a cascade of caches, is in large part the result of our inability to afford making the entire memory out of the most expensive type of hardware.

Page: 102 Unavoidably, the larger a memory is, the more time it takes to search for and extract a piece of information from it.

Page: 103 what we call “cognitive decline”—lags and retrieval errors—may not be about the search process slowing or deteriorating, but (at least partly) an unavoidable consequence of the amount of information we have to navigate getting bigger and bigger.

Page: 104 So as you age, and begin to experience these sporadic latencies, take heart: the length of a delay is partly an indicator of the extent of your experience. The effort of retrieval is a testament to how much you know. And the rarity of those lags is a testament to how well you’ve arranged it: keeping the most important things closest to hand.

Page: 108 We can’t declare some schedule a winner until we know how we’re keeping score.

Page: 108 before you can have a plan, you must first choose a metric.

Page: 109 Do the difficult things while they are easy and do the great things while they are small. —LAO TZU

Page: 111 This debt-reduction strategy says to ignore the number and size of your debts entirely, and simply funnel your money toward the debt with the single highest interest rate.

Page: 112 “denial of service” attack: give a system an overwhelming number of trivial things to do, and the important things get lost in the chaos.

Page: 112 “this seemingly irrational choice reflected a tendency to pre-crastinate, a term we introduce to refer to the hastening of subgoal completion, even at the expense of extra physical effort.”

Page: 112 It’s not that they have a bad strategy for getting things done; they have a great strategy for the wrong metric.

Page: 113 This starts with making sure that the single-machine problem we’re solving is the one we want to be solving.

Page: 113 Staying focused not just on getting things done but on getting weighty things done—doing the most important work you can at every moment—

Page: 115 “Things which matter most must never be at the mercy of things which matter least,”

Page: 115 Sometimes that which matters most cannot be done until that which matters least is finished, so there’s no choice but to treat that unimportant thing as being every bit as important as whatever it’s blocking.

Page: 119 In fact, the weighted version of Shortest Processing Time is a pretty good candidate for best general-purpose scheduling strategy in the face of uncertainty. It offers a simple prescription for time management: each time a new piece of work comes in, divide its importance by the amount of time it will take to complete. If that figure is higher than for the task you’re currently doing, switch to the new one; otherwise stick with the current task.
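
The weighted Shortest Processing Time prescription reduces to one line of code: rank tasks by importance divided by duration, highest ratio first. The task list below is hypothetical, just to show the ordering.

```python
def weighted_spt(tasks):
    """Order tasks by importance divided by duration, highest ratio first."""
    return sorted(tasks, key=lambda t: t["importance"] / t["time"], reverse=True)

# Hypothetical to-do list: importance weight and hours to complete.
todo = [
    {"name": "expense report",  "importance": 1, "time": 2},
    {"name": "client proposal", "importance": 8, "time": 4},
    {"name": "quick email",     "importance": 3, "time": 0.5},
]
print([t["name"] for t in weighted_spt(todo)])
# ['quick email', 'client proposal', 'expense report']  (ratios 6.0, 2.0, 0.5)
```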

Page: 119 When the future is foggy, it turns out you don’t need a calendar—just a to-do list.

Page: 120 Second, preemption isn’t free. Every time you switch tasks, you pay a price, known in computer science as a context switch.

Page: 120 Psychologists have shown that for us, the effects of switching tasks can include both delays and errors—at the scale of minutes rather than microseconds. To put that figure in perspective, anyone you interrupt more than a few times an hour is in danger of doing no work at all.

Page: 120 Personally, we have found that both programming and writing require keeping in mind the state of the entire system, and thus carry inordinately large context-switching costs.

Page: 120 “If it’s less than an hour I’ll just do errands instead, because it’ll take me the first thirty-five minutes to really figure out what I want to do and then I might not have time to do it.”

Page: 121 At its nightmarish extreme, this turns into a phenomenon called thrashing.

Page: 123 Another way to avert thrashing before it starts is to learn the art of saying no.

Page: 124 These two principles are called responsiveness and throughput: how quickly you can respond to things, and how much you can get done overall.

Page: 125 Establishing a minimum amount of time to spend on any one task helps to prevent a commitment to responsiveness from obliterating throughput entirely:

Page: 125 Methods such as “timeboxing” or “pomodoros,” where you literally set a kitchen timer and commit to doing a single task until it runs out, are one embodiment of this idea.

Page: 126 Decide how responsive you need to be—and then, if you want to get things done, be no more responsive than that.

Page: 127 regularly scheduled meetings are one of our best defenses against the spontaneous interruption and the unplanned context switch.

Page: 127 “Email is a wonderful thing for people whose role in life is to be on top of things. But not for me; my role is to be on the bottom of things. What I do takes long hours of studying and uninterruptible concentration.”

Page: 129 If we be, therefore, engaged by arguments to put trust in past experience, and make it the standard of our future judgement, these arguments must be probable only. —DAVID HUME

Page: 132 In fact, for any possible drawing of w winning tickets in n attempts, the expectation is simply the number of wins plus one, divided by the number of attempts plus two: (w + 1)⁄(n + 2).

Page: 132 Philosophical Essay on Probabilities,

Page: 134 problem of how to combine preexisting beliefs with observed evidence: multiply their probabilities together.

Page: 134 The methods developed by Bayes and Laplace can offer help any time you have uncertainty and a bit of data to work with.

Page: 134 And that’s exactly the situation we face when we try to predict the future.

Page: 135 More generally, unless we know better we can expect to have shown up precisely halfway into the duration of any given phenomenon.*

Page: 136 And it turns out that the Copernican Principle is exactly what results from applying Bayes’s Rule using what is known as an uninformative prior.

Page: 137 Recognizing that the Copernican Principle is just Bayes’s Rule with an uninformative prior answers a lot of questions about its validity.

Page: 138 The richer the prior information we bring to Bayes’s Rule, the more useful the predictions we can get out of it.

Page: 138 In the broadest sense, there are two types of things in the world: things that tend toward (or cluster around) some kind of “natural” value, and things that don’t.

Page: 138 This kind of pattern typifies what are called “power-law distributions.”

Page: 138 they characterize quantities that can plausibly range over many scales:

Page: 139 In fact, money in general is a domain full of power laws.

Page: 139 So it is: two-thirds of the US population make less than the mean income, but the top 1% make almost ten times the mean.

Page: 139 It’s often lamented that “the rich get richer,” and indeed the process of “preferential attachment” is one of the surest ways to produce a power-law distribution.

Page: 139 Good predictions thus begin with having good instincts about when we’re dealing with a normal distribution and when with a power-law distribution. As it turns out, Bayes’s Rule offers us a simple but dramatically different predictive rule of thumb for each.

Page: 139 And for any power-law distribution, Bayes’s Rule indicates that the appropriate prediction strategy is a Multiplicative Rule: multiply the quantity observed so far by some constant factor.

Page: 140 The larger the value of that single data point, the larger the scale we’re probably dealing with, and vice versa.

Page: 140 When we apply Bayes’s Rule with a normal distribution as a prior, on the other hand, we obtain a very different kind of guidance. Instead of a multiplicative rule, we get an Average Rule: use the distribution’s “natural” average—its single, specific scale—as your guide.

Page: 140 Something normally distributed that’s gone on seemingly too long is bound to end shortly; but the longer something in a power-law distribution has gone on, the longer you can expect it to keep going.

Page: 141 The Erlang distribution gives us a third kind of prediction rule, the Additive Rule: always predict that things will go on just a constant amount longer.

Page: 141 Indeed, distributions that yield the same prediction, no matter their history or current state, are known to statisticians as “memoryless.”

Page: 142 In a power-law distribution, the longer something has gone on, the longer we expect it to continue going on. So a power-law event is more surprising the longer we’ve been waiting for it—and maximally surprising right before it happens. A nation, corporation, or institution only grows more venerable with each passing year, so it’s always stunning when it collapses. In a normal distribution, events are surprising when they’re early—since we expected them to reach the average—but not when they’re late. Indeed, by that point they seem overdue to happen, so the longer we wait, the more we expect them. And in an Erlang distribution, events by definition are never any more or less surprising no matter when they occur. Any state of affairs is always equally likely to end regardless of how long it’s lasted. No wonder politicians are always thinking about their next election.
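
The three prediction rules can be sketched as one-liners. These are toy versions: the multiplicative factor of 2 matches the Copernican Principle's "expect to be halfway through," while the natural average of 76 and the additive constant of 10 are hypothetical stand-ins for whatever the prior actually says.

```python
def predict_power_law(observed, factor=2.0):
    """Multiplicative Rule: expect the total to be a multiple of what you've seen."""
    return observed * factor

def predict_normal(observed, natural_average=76):
    """Average Rule: predict the distribution's natural scale,
    unless you've already passed it."""
    return max(observed, natural_average)

def predict_memoryless(observed, constant=10):
    """Additive Rule: always expect a fixed amount more, however long it's been."""
    return observed + constant

print(predict_power_law(8))     # 16.0: a power-law quantity keeps scaling up
print(predict_normal(20))       # 76: a normal quantity reverts to its average
print(predict_memoryless(5))    # 15: a memoryless wait always has "10 more" to go
```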

Page: 145 In cases where we don’t have good priors, our predictions aren’t good.

Page: 145 What we project about the future reveals a lot—about the world we live in, and about our own past.

Page: 146 If the marshmallow test is about willpower, this is a powerful testament to the impact that learning self-control can have on one’s life. But if the test is less about will than about expectations, then this tells a different, perhaps more poignant story.

Page: 147 It could be a result of believing that adults are not dependable: that they can’t be trusted to keep their word, that they disappear for intervals of arbitrary length.

Page: 147 As if someone were to buy several copies of the morning paper to assure himself that what it said was true. —LUDWIG WITTGENSTEIN

Page: 148 There’s a curious tension, then, between communicating with others and maintaining accurate priors about the world.

Page: 148 the representation of events in the media does not track their frequency in the world.

Page: 148 protect your priors.

Page: 148 Counterintuitively, that might mean turning off the news.

Page: 157 “It really is true that the company will build whatever the CEO decides to measure.”

Page: 158 “Friends don’t let friends measure Page Views. Ever.”

Page: 160 If you can’t explain it simply, you don’t understand it well enough. —ANONYMOUS

Page: 161 introduce an additional term to your calculations that penalizes more complex solutions.

Page: 161 Computer scientists refer to this principle—using constraints that penalize models for their complexity—as Regularization.

Page: 161 is called the Lasso and uses as its penalty the total weight of the different factors in the model.*

Page: 161 On the other hand, we can also infer that a substantially more complex brain probably didn’t provide sufficient dividends, evolutionarily speaking. We’re as brainy as we have needed to be, but not extravagantly more so.

Page: 162 But it’s precisely because of the complexity of real life that a simple heuristic might in fact be the rational solution.

Page: 163 Gerd Gigerenzer and Henry Brighton have argued that the decision-making shortcuts people use in the real world are in many cases exactly the kind of thinking that makes for good decisions.

Page: 164 In contrast, if we look at the way organisms—including humans—evolve, we notice something intriguing: change happens slowly.

Page: 165 As a species, being constrained by the past makes us less perfectly adjusted to the present we know but helps keep us robust for the future we don’t.

Page: 165 In machine learning, the advantages of moving slowly emerge most concretely in a regularization technique known as Early Stopping.

Page: 173 The perfect is the enemy of the good.

Page: 175 If you can’t solve the problem in front of you, solve an easier version of it—and then see if that solution offers you a starting point, or a beacon, in the full-blown problem. Maybe it does.

Page: 178 When an optimization problem’s constraints say “Do it, or else!,” Lagrangian Relaxation replies, “Or else what?”

Page: 182 There is a deep message in the fact that on certain problems, randomized approaches can outperform even the best deterministic ones.

Page: 183 when we want to know something about a complex quantity, we can estimate its value by sampling from it.

Page: 184 In a sufficiently complicated problem, actual sampling is better than an examination of all the chains of possibilities.

Page: 185 there will always be some error associated with a sampling process, though you can reduce it by ensuring your samples are indeed random and by taking more and more of them.

Page: 185 But simulating it, with each interaction being like turning over a new card, provides an alternative.

Page: 185 Metropolis named this approach—replacing exhaustive probability calculations with sample simulations—the Monte Carlo Method,

Page: 188 The Miller-Rabin primality test, as it’s now known, provides a way to quickly identify even gigantic prime numbers with an arbitrary degree of certainty.
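
A compact Miller-Rabin implementation: each random round either proves the number composite or lets it pass, and a composite survives any single round with probability at most 1/4, so 20 rounds push the error below one in a trillion.

```python
import random

def miller_rabin(n, rounds=20, rng=random.Random(3)):
    """Probabilistic primality test with an arbitrarily small error rate."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1                               # n - 1 = d * 2^r with d odd
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                     # a is a witness: n is composite
    return True                              # probably prime

print(miller_rabin(2**61 - 1))   # True: a known Mersenne prime
print(miller_rabin(2**61 + 1))   # composite (divisible by 3), so almost surely False
```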

Page: 191 close examination of random samples can be one of the most effective means of making sense of something too complex to be comprehended directly.

Page: 193 when a man is capable of being in uncertainties, mysteries, doubts, without any irritable reaching after fact and reason.

Page: 193 “What we’re going to do is come up with an answer which saves you in time and space and trades off this third dimension: error probability.”

Page: 193 Bloom filter,
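
A sketch of a Bloom filter using SHA-256 as a stand-in hash family (the size and hash count are arbitrary). It trades a small, tunable false-positive rate for dramatic savings in space: it can say "definitely not present" or "probably present," but never misses an item it has seen.

```python
import hashlib

class BloomFilter:
    """Space-saving set membership with a tunable false-positive rate."""
    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Derive several independent-ish bit positions from one hash function.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("http://example.com")
print("http://example.com" in bf)   # True: no false negatives, ever
print("http://other.com" in bf)     # almost certainly False, but not guaranteed
```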

Page: 195 a greedy algorithm—for instance, always doing the shortest job available, without looking or planning beyond—

Page: 197 as “Random-Restart Hill Climbing”—or, more colorfully, as “Shotgun Hill Climbing.” It’s a strategy that proves very effective when there are lots of local maxima in a problem.

Page: 198 “The way to study [physical systems] was to warm them up then cool them down, and let the system organize itself. From that background, it seemed like a perfectly natural thing to treat all kinds of optimization problems as if the degrees of freedom that you were trying to organize were little atoms, or spins, or what have you.”

Page: 199 if you treated an optimization problem like an annealing problem—if you “heated it up” and then slowly “cooled it off”?

Page: 199 Taking the ten-city vacation problem from above, we could start at a “high temperature” by picking our starting itinerary entirely at random, plucking one out of the whole space of possible solutions regardless of price. Then we can start to slowly “cool down” our search by rolling a die whenever we are considering a tweak to the city sequence. Taking a superior variation always makes sense, but we would only take inferior ones when the die shows, say, a 2 or more. After a while, we’d cool it further by only taking a higher-price change if the die shows a 3 or greater—then 4, then 5. Eventually we’d be mostly hill climbing, making the inferior move just occasionally when the die shows a 6. Finally we’d start going only uphill, and stop when we reached the next local max.
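
The dice-rolling scheme above can be sketched as simulated annealing on a ten-city tour (the distances are random and hypothetical, and the linear cooling schedule is one simple choice among many): start hot enough to accept almost any change, then cool toward pure hill climbing.

```python
import math
import random

def anneal_tour(dist, steps=20_000, rng=random.Random(4)):
    """Simulated annealing on a traveling-salesman tour."""
    n = len(dist)
    tour = list(range(n))
    rng.shuffle(tour)

    def cost(t):
        return sum(dist[t[i]][t[(i + 1) % n]] for i in range(n))

    current = cost(tour)
    for step in range(steps):
        temp = 10.0 * (1.0 - step / steps) + 0.01       # linear cooling schedule
        i, j = sorted(rng.sample(range(n), 2))
        candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]  # reverse a segment
        delta = cost(candidate) - current
        # Metropolis rule: always take improvements; take a worse tour with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            tour, current = candidate, current + delta
    return tour, current

# Ten hypothetical cities with random symmetric distances.
city_rng = random.Random(5)
n = 10
dist = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        dist[i][j] = dist[j][i] = city_rng.uniform(1, 10)

tour, found = anneal_tour(dist)
print(sorted(tour), round(found, 2))   # a valid tour, far cheaper than a random one
```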

Page: 199 Simulated Annealing,

Page: 199 Kirkpatrick and Gelatt’s simulated annealing algorithms

Page: 199 one of the most promising approaches to optimization problems known to the field.

Page: 201 “serendipity,”

Page: 201 “were always making discoveries, by accidents and sagacity, of things they were not in quest of.”

Page: 201 New conceptions, emotions, and active tendencies which evolve are originally produced in the shape of random images, fancies, accidental out-births of spontaneous variation in the functional activity of the excessively unstable human brain, which the outer environment simply confirms or refutes, adopts or rejects, preserves or destroys—selects, in short, just as it selects morphological and social variations due to molecular accidents of an analogous sort.

Page: 202 “we seem suddenly introduced into a seething caldron of ideas, where everything is fizzling and bobbing about in a state of bewildering activity, where partnerships can be joined or loosened in an instant, treadmill routine is unknown, and the unexpected seems the only law.” (Note here the same “annealing” intuition, rooted in metaphors of temperature, where wild permutation equals heat.)

Page: 202 “A blind-variation-and-selective-retention process is fundamental to all inductive achievements, to all genuine increases in knowledge, to all increases in fit of system to environment.”

Page: 202 when they say that thought, melodies, and harmonies had poured in upon them, and that they had simply retained the right ones.”

Page: 202 cards known as Oblique Strategies

Page: 203 Being randomly jittered, thrown out of the frame and focused on a larger scale, provides a way to leave what might be locally good and get back to the pursuit of what might be globally optimal.

Page: 204 First, from Hill Climbing: even if you’re in the habit of sometimes acting on bad ideas, you should always act on good ones. Second, from the Metropolis Algorithm: your likelihood of following a bad idea should be inversely proportional to how bad an idea it is. Third, from Simulated Annealing: you should front-load randomness, rapidly cooling out of a totally random state, using ever less and less randomness as time goes on, lingering longest as you approach freezing. Temper yourself—literally.

Page: 204 “Once you got somewhere you were happy,” he told the Guardian, “you’d be stupid to shake it up any further.”

Page: 206 The foundation of human connection is protocol—a shared convention of procedures and expectations,

Page: 206 Greek protokollon, “first glue,” which referred to the outer page attached to a book or manuscript.

Page: 206 Most of our communication technology—from the telegraph to the text—has merely provided us with new conduits to experience these familiar person-to-person challenges.

Page: 206 Packet Switching

Page: 208 Paul Baran at the RAND Corporation was trying to solve the problem of network robustness,

Page: 209 Acknowledgment

Page: 209 “Byzantine generals problem.”

Page: 210 packet delivery is confirmed by what are called acknowledgment packets, or ACKs.

Page: 212 Exponential Backoff: The Algorithm of Forgiveness

Page: 214 Exponential Backoff.
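
The core of Exponential Backoff fits in a few lines: after each consecutive failure the average wait doubles (here capped at an arbitrary 64 time units), with random jitter so competing senders don't retry in lockstep.

```python
import random

def backoff_delays(attempts, base=1.0, cap=64.0, rng=random.Random(6)):
    """Exponential Backoff: double the retry window after each failure."""
    delays = []
    for attempt in range(attempts):
        window = min(cap, base * 2 ** attempt)   # doubling window, capped
        delays.append(rng.uniform(0, window))    # jitter within the window
    return delays

print([round(d, 2) for d in backoff_delays(6)])
```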

Page: 216 Flow Control and Congestion Avoidance

Page: 217 Additive Increase, Multiplicative Decrease,
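
AIMD produces TCP's characteristic sawtooth: creep the sending window up by a fixed step each successful round, and cut it in half whenever a loss signals congestion. (The round count and single loss event below are invented for illustration.)

```python
def aimd(rounds, losses, increase=1.0, decrease=0.5):
    """Additive Increase, Multiplicative Decrease on a sending window."""
    window, history = 1.0, []
    for t in range(rounds):
        if t in losses:
            window = max(1.0, window * decrease)   # congestion: back off hard
        else:
            window += increase                     # all clear: creep upward
        history.append(window)
    return history

h = aimd(10, losses={5})
print(h)   # [2.0, 3.0, 4.0, 5.0, 6.0, 3.0, 4.0, 5.0, 6.0, 7.0]: the sawtooth
```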

Page: 219 “Every public servant should be demoted to the immediately lower rank,” he wrote, “because they were advanced until they became incompetent.”

Page: 219 in an unpredictable and changing environment, pushing things to the point of failure is indeed sometimes the best (or the only) way to use all the resources to their fullest.

Page: 220 What matters is making sure that the response to failure is both sharp and resilient.

Page: 221 With poor feedback, they discovered, the story falls apart.

Page: 221 In fact, it’s now clear that the cause and effect are often the reverse: a poor listener destroys the tale.

Page: 225 In real life, packet loss is almost total.

Page: 226 It used to be that people knocked on your door, got no response, and went away. Now they’re effectively waiting in line when you come home.

Page: 226 We used to reject; now we defer.

Page: 226 The much-lamented “lack of idleness” one reads about is, perversely, the primary feature of buffers:

Page: 227 “A generalization I take away from this whole experience is that engineers should think about time as a first-class citizen.”

Page: 229 Optimal stopping problems spring from the irreversibility and irrevocability of time; the explore/exploit dilemma, from time’s limited supply. Relaxation and randomization emerge as vital and necessary strategies for dealing with the ineluctable complexity of challenges like trip planning and vaccinations.

Page: 230 “successful investing is anticipating the anticipations of others.”

Page: 231 Simply put, any time a system—be it a machine or a mind—simulates the workings of something as complex as itself, it finds its resources totally maxed out, more or less by definition.

Page: 234 Put broadly, the object of study in mathematics is truth; the object of study in computer science is complexity.

Page: 240 what rules will give us the behavior we want to see?

Page: 242 Mechanism design makes a powerful argument for the need for a designer—be it a CEO, a contract binding all parties, or a don who enforces omertà by garroted carotid.

Page: 242 Scaling up this logic results in a potent argument for the role of government.

Page: 245 Nature is full of examples of individuals being essentially hijacked to serve the goals of another species. The lancet liver fluke (Dicrocoelium dendriticum), for instance, is a parasite that makes ants deliberately climb to the tops of grass blades so that they’ll be eaten by sheep—the lancet fluke’s preferred host. Likewise, the parasite Toxoplasma gondii makes mice permanently lose their fear of cats, with similar results.

Page: 245 “Morality is herd instinct in the individual,” wrote Nietzsche.

Page: 245 “If people expect us to respond irrationally to the theft of our property, we will seldom need to, because it will not be in their interests to steal it.

Page: 245 knowing that the other party (be it spouse or landlord) is in turn prepared to jump ship would prevent many of the long-term investments (having children together,

Page: 246 you need a feeling that makes you not want to separate,

Page: 246 Happiness is the lock.

Page: 246 So the rational argument for love is twofold: the emotions of attachment not only spare you from recursively overthinking your partner’s intentions, but by changing the payoffs actually enable a better outcome altogether. What’s more, being able to fall involuntarily in love makes you, in turn, a more attractive partner to have. Your capacity for heartbreak, for sleeping with the emotional fishes, is the very quality that makes you such a trusty accomplice.

Page: 247 Whenever you find yourself on the side of the majority, it is time to pause and reflect. —MARK TWAIN

Page: 247 Fads and fashions are the result of following others’ behavior without being anchored to any underlying objective truth about the world.

Page: 248 “sealed-bid first-price auction,”

Page: 248 “Dutch auction” or “descending auction,”

Page: 248 “English auction” or “ascending auction”—

Page: 249 “information cascade.”

Page: 250 “Something very important happens once somebody decides to follow blindly his predecessors independently of his own information signal, and that is that his action becomes uninformative to all later decision makers.

Page: 251 Information cascades offer a rational theory not only of bubbles, but also of fads and herd behavior more generally.
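The cascade mechanism in the quotes above can be sketched in a few lines. This is a toy approximation of the standard model (the function name and the lead-of-two rule are my own simplification): each agent sees a private binary signal plus everyone's earlier *actions*, counts each earlier action as one more vote, and once the public tally leads by two, private signals stop mattering and every later action becomes uninformative imitation.

```python
def cascade(signals):
    """Toy information-cascade simulation.

    signals: each agent's private signal (1 = 'good', 0 = 'bad').
    Returns the sequence of public actions. An agent follows their
    own signal only while the herd's lead is less than two; after
    that, they imitate, so their action reveals nothing new.
    """
    actions = []
    for s in signals:
        lead = actions.count(1) - actions.count(0)
        if lead >= 2:
            actions.append(1)   # adopt cascade: imitate, ignore own signal
        elif lead <= -2:
            actions.append(0)   # reject cascade: likewise
        else:
            actions.append(s)   # herd not yet decisive: follow own signal
    return actions

# Two early 'good' signals lock in an adopt cascade, even though
# every later private signal says 'bad':
print(cascade([1, 1, 0, 0, 0, 0]))
# -> [1, 1, 1, 1, 1, 1]
```

The example makes the quote's point concrete: the last four agents all privately think "bad," but their public behavior is indistinguishable from unanimous enthusiasm.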

Page: 251 When you’re mostly looking to others to set a course, they may well be looking right back at you to do the same.

Page: 251 actions are not beliefs; cascades get caused in part when we misinterpret what others think based on what they do.

Page: 252 Vickrey auction,

Page: 252 in a Vickrey auction, the winner ends up paying not the amount of their own bid, but that of the second-place bidder.
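The second-price rule described here is simple enough to state as code. A minimal sketch (the function name and bidder names are illustrative): the highest bidder wins, but the price is set by the runner-up's bid, which is what makes honest bidding a dominant strategy.

```python
def vickrey_outcome(bids):
    """Sealed-bid second-price (Vickrey) auction.

    bids: dict mapping bidder -> bid amount (assumes >= 2 bidders).
    Returns (winner, price). The winner is the highest bidder, but
    the price paid is the second-highest bid, so shading your bid
    below your true value can only lose you auctions, never save
    you money.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # runner-up's bid sets the price
    return winner, price

# The winner pays the runner-up's bid, not their own:
print(vickrey_outcome({"alice": 100, "bob": 80, "carol": 60}))
# -> ('alice', 80)
```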

Page: 253 In a finding called the “revelation principle,” Nobel laureate Roger Myerson proved that any game that requires strategically masking the truth can be transformed into a game that requires nothing but simple honesty.

Page: 254 “The basic thing is if you don’t want your clients to optimize against you, you’d better optimize for them. That’s the whole proof.… If I design an algorithm that already optimizes for you, there is nothing you can do.”

Page: 255 Adopting a strategy that doesn’t require anticipating, predicting, reading into, or changing course because of the tactics of others is one way to cut the Gordian knot of recursion. And sometimes that strategy is not just easy—it’s optimal.

Page: 255 If changing strategies doesn’t help, you can try to change the game. And if that’s not possible, you can at least exercise some control over which games you choose to play.

Page: 256 I firmly believe that the important things about humans are social in character and that relief by machines from many of our present demanding intellectual functions will finally give the human race time and incentive to learn how to live well together. —MERRILL FLOOD

Page: 256 Any dynamic system subject to the constraints of space and time is up against a core set of fundamental and unavoidable problems.

Page: 256 good algorithmic approaches that can simply be transferred over to human problems. The 37% Rule, the Least Recently Used criterion for handling overflowing caches, and the Upper Confidence Bound as a guide to exploration are all examples of this.

Page: 256 If you followed the best possible process, then you’ve done all you can,

Page: 256 We can hope to be fortunate—but we should strive to be wise. Call it a kind of computational Stoicism.

Page: 256 If you wind up stuck in an intractable scenario, remember that heuristics, approximations, and strategic use of randomness can help you find workable solutions.

Page: 256 Our interviewees were on average more likely to be available when we requested a meeting, say, “next Tuesday between 1:00 and 2:00 p.m. PST” than “at a convenient time this coming week.”

Page: 256 people preferred receiving a constrained problem,

Page: 256 complexity gap between “verification” and “search”—which is about as wide as the gap between knowing a good song when you hear it and writing one on the spot.

Page: 256 When we interact with other people, we present them with computational problems—

Page: 256 It has the veneer of kindness about it, but it does two deeply alarming things. First, it passes the cognitive buck: “Here’s a problem, you handle it.”

Page: 256 by not stating your preferences, it invites the others to simulate or imagine them.

Page: 256 simulation of the minds of others is one of the biggest computational challenges a mind (or machine) can ever face.

Page: 256 One of the chief goals of design ought to be protecting people from unnecessary tension, friction, and mental labor.

Page: 256 Urban planners and architects routinely weigh how different lot designs will use resources such as limited space, materials, and money. But they rarely account for the way their designs tax the computational resources of the people who use them.

Page: 256 the best algorithms are all about doing what makes the most sense in the least amount of time,

Page: 256 Up against such hard cases, effective algorithms make assumptions, show a bias toward simpler solutions, trade off the costs of error against the costs of delay, and take chances.