What Data Actually Predicts a Roof Replacement (A Roofer's Field Model)

Emily Crawford, Home Maintenance Editor · Jun 24, 2026 · 30 min read · Roofing Lead Generation


Ask ten roofers what tells them a house is going to need a new roof and you'll get ten answers: the age, the curl in the shingles, the hail map, the year the neighborhood was built, a gut feeling from the truck. Every one of those holds a piece of the truth. None of them is enough alone. And when you're trying to decide which 400 doors out of 4,000 your crew should knock this month, gut feeling doesn't scale.

So the real question isn't "how can I tell this one roof is old" — a ladder answers that. The question is: across a whole area you've never set foot on, what measurable data actually moves the odds that a given address is due for replacement, and how much does each piece move them? Because that's the difference between a list that closes and a list that wastes a season of payroll, gas, and postage on roofs that aren't ready.

What follows is the honest version of that answer. Which data predicts a replacement, ranked by how much signal it carries. Why some of the data roofers swear by is weaker than they think, and why some they ignore is stronger. A scoring model you can build in a spreadsheet this week. Worked examples on real-looking addresses. The edge cases that quietly poison a naive list. And a straight accounting of where prediction stops and where you still have to put a human on a roof.

One thing up front, because it shapes everything below. Nothing here predicts a replacement the way a thermometer predicts a temperature. You are not getting certainty. You are getting odds — a way to sort thousands of homes so the ones most likely to be due float to the top and the ones least likely sink. A good model doesn't tell you "this roof is dead." It tells you "knock these 400 first, and you'll find far more real jobs per hour than if you knocked at random." That's the whole game. Treat any tool or salesperson who promises more than odds as a red flag.

The two questions every replacement answers

Before the data, get the logic straight, because it's the thing most prospecting lists get wrong.

A roof gets replaced for one of two reasons, and they're almost completely separate:

It wore out. The material reached the end of its service life. Granules gone, mat brittle, seals failed, leaks starting. This is a function of age, material, install quality, sun, and ventilation. It happens whether or not a storm ever comes.
Something broke it. A hail or wind event damaged it badly enough that it needs replacing now, regardless of how much life it had left. A 9-year-old roof can be totaled by one bad hailstorm. A 28-year-old roof can sit untouched for years because nothing ever hit it.

These two paths demand different data. The wear-out path is predicted by age and the things that age a roof faster or slower. The broken path is predicted by storm exposure modeled at the roof, not at the ZIP code. A list built only on age misses every storm total. A list built only on hail maps misses every roof quietly aging into failure on a calm street. The strongest predictor of a replacement is the combination — a roof that is both old enough to be vulnerable and exposed to events that hit roofs like it.

Keep these two columns separate in your head and in your spreadsheet. We'll score them separately and then combine them, because a roof can light up on either path or both.

There's a third thing people lump in here that isn't a replacement predictor at all, and it's worth naming so you don't waste a column on it: repairability. A roof can have a problem — a few lifted shingles, a failed pipe boot, a small leak — and be a repair, not a replacement. The data that predicts a repair (isolated damage on an otherwise sound roof) is almost the opposite of the data that predicts a replacement (whole-roof wear or widespread storm damage). When you're building a list to find replacements, you want roofs where the failure is systemic, not spot. Age and material drive systemic failure; a single visible defect on a young roof usually doesn't. We're scoring for replacements, so weight accordingly and don't let one dramatic-looking defect on a 6-year-old roof pull it onto your A-list.

The data, ranked by how much it predicts

Here's the full field of predictors, sorted into tiers by how much each one actually moves the odds. The tiering matters more than the list. Plenty of roofers over-weight a weak signal because it's easy to get, and under-weight a strong one because it's harder.

Tier 1 — the predictors that carry the most weight

Roof age (as a range, never a date). This is the single most predictive piece of data you can have, full stop. A roof's odds of needing replacement climb steeply as it crosses the back half of its rated life. The catch is that you almost never get an exact install date for a house you've never touched — and the dates you can find are usually wrong (more on that below). What you can get, from aerial imagery compared across years and from neighborhood build patterns, is an age range — "this roof was last redone roughly 16 to 22 years ago." A tight range is enormously useful. A false exact date is worse than nothing because it makes you confident about something untrue.

Material type. A 20-year-old roof means something completely different depending on what's up there. Three-tab asphalt rated for 20 years is at the end of its life. Architectural asphalt rated 30 has a decade left. Standing-seam metal or tile at 20 years is barely middle-aged. Material sets the clock that age runs against. You can often read material from aerial imagery (texture, panel lines, tile shadows) well enough to bucket it, and that bucket changes the entire age calculation.

Per-roof storm exposure. Not "did it hail in this ZIP." Did hail of a damaging size, driven by wind from a damaging direction, actually strike this roof — given its slope, orientation, and the path the storm took? A hail map shows you a county-wide blob. What predicts a replacement is exposure modeled down to the individual roof, because two houses three streets apart can have wildly different exposure from the same storm. This is the predictor most lists get laziest about, and it's a Tier 1 signal when it's done at the roof and a near-useless one when it's done at the ZIP.

Tier 2 — strong supporting predictors

Local climate and sun load. The same shingle dies years sooner in Phoenix than in Portland. UV, heat cycling, and thermal expansion grind asphalt down. A south- and west-facing slope in a high-UV market ages measurably faster than a north slope two states north. This doesn't predict on its own, but it shifts the age range you apply to a whole market and to individual slopes.

Roof geometry and pitch. Low-slope sections, complex valleys, and lots of penetrations fail sooner and leak earlier than simple high-pitched gable roofs. Geometry is readable from imagery and it changes how much usable life a given age implies.

Prior permit history. A re-roof permit on file resets the age clock — and the absence of any roof permit on an old house, in a jurisdiction that requires them, is itself a signal that the original roof may still be up there. Permit data is patchy and varies wildly by county, but where it exists it's a strong correction on your age estimate.

Tree cover and exposure. Heavy shade slows UV aging but adds debris, moisture, moss, and overhang damage. Open, unshaded roofs age faster from sun but take wind and hail full-on. Either way it's information, and it's visible from above.

Tier 3 — weak signals, false friends, and over-rated data

Year the house was built. This is the predictor roofers and homeowners trust most and it is one of the weakest for this purpose, because a re-roof is invisible to it. A house built in 1994 may have had a brand-new roof in 2019. The build year tells you the roof's maximum possible age, nothing about its actual age. It's a ceiling, not an estimate. Useful as one input; dangerous as the whole basis of a list.

Zestimate-style "roof age" fields and public property records. Most online property data either doesn't have a real roof age or quietly back-fills it from the build year. Treat any "roof age" you didn't derive from imagery or a permit as suspect.

Homeowner tenure / recent sale. A house that sold recently sometimes got a pre-sale roof; a long-tenured owner is more likely to have an aging original. Mild signal, easily wrong, fine as a tiebreaker.

Income or home value alone. Predicts ability to pay, not whether the roof is due. A useful filter on a list, not a predictor of need. Don't confuse the two.

Curbside visual signs. Curl, granule loss in the gutters, a patchy or stained surface — these are genuinely strong on the one roof you're standing in front of. But they're not data you have across thousands of addresses before you knock, so they belong in the field-confirmation step, not the list-building step. Don't skip them; just know they come later.

Here's the field collapsed into one table you can pin above your desk.

Data point	What it predicts	Signal strength	How you get it at scale	Main trap
Roof age (range)	Wear-out odds	Very high	Aerial imagery over time, build patterns	Treating a guess as an exact date
Material type	Resets the life clock	High	Aerial imagery, neighborhood norms	Misreading texture from above
Per-roof storm exposure	Damage / total odds	Very high (per-roof), low (per-ZIP)	Modeled hail/wind at the address	Using a county hail map as the whole answer
Climate / sun load	Shifts the age curve	Medium	Market data, slope orientation	Applying one national lifespan everywhere
Roof geometry / pitch	Early-failure odds	Medium	Aerial imagery	Ignoring low-slope sections
Permit history	Corrects age estimate	Medium-high where it exists	County records	Patchy, inconsistent coverage
Tree cover	Modifies aging	Low-medium	Aerial imagery	Reading it as decisive
Year built	Max possible age only	Low	Public records	Re-roofs are invisible to it
Public "roof age" field	Usually none / back-filled	Very low	Property data sites	Trusting a number that's really the build year
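Owner tenure / recent sale	Original vs. pre-sale roof (mild hint)	Low	Property data	Easily wrong; tiebreaker only
Income / home value	Ability to pay, not need	Low	Property data	Confusing afford with need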

Why the data roofers trust most is often the weakest

It's worth dwelling on the Tier 3 traps, because every one of them has sunk a real prospecting list.

Build year is the big one. Pull any neighborhood and you can get the year every house was built in about ten minutes. It feels like roof data. It isn't. In a subdivision built in 1998, the original 25-year roofs started aging out around 2018 to 2023 — which means by now a sizable chunk have already been redone. If you mail the whole 1998 subdivision as "25-year-old roofs," you'll be knocking a meaningful share of houses with five-year-old roofs and burning your credibility at the door. The build year was right. Your conclusion was wrong, because the re-roofs are invisible to it.

Public property records and "roof age" fields fail the same way, just one step removed. When a data source shows a roof age, ask where it came from. If it can't tell you it was derived from imagery or a permit, assume it's the build year wearing a costume.

County hail maps are the storm-side equivalent. A map showing a hail swath over your county tells you a storm happened somewhere near these people. It does not tell you which roofs took damaging stones at a damaging angle. Plenty of roofers have canvassed a whole "hail-hit" ZIP, found nothing on three streets, assumed the storm was a dud, and walked away from the two streets that actually got hammered. The map wasn't wrong. It was just too coarse to predict anything at the door.

The pattern across all three: the easy data tells you about the area or the house, and you need data about the roof. The gap between those is where wasted seasons live.

How to read material and age off an aerial image

Since two of your three Tier 1 predictors — material and age — come from imagery, it's worth being precise about what's actually readable from above and what isn't, because this is where homegrown lists pick up half their errors.

Reading material. You're bucketing, not certifying. From a clear overhead or oblique image:

3-tab asphalt reads flat and uniform, with thin horizontal lines and no shadow depth. It looks like a printed grid.
Architectural (dimensional) asphalt reads with visible shadow lines and a broken, staggered texture — depth you can see when the sun is low in the image.
Metal shows long straight panel seams running up the slope, often a slightly reflective or cooler tone, and very clean edges.
Tile shows repeating rounded or ribbed rows with strong individual shadows, almost a corduroy look.
Wood shake reads as irregular, broken, often grayed coursing without tile's regularity.

The common misreads: weathered architectural can flatten out and read as 3-tab in a low-resolution image (you'll under-estimate its rated life), and a faded metal roof on a low slope can read as something else entirely. When the read is ambiguous, default to the neighborhood norm — subdivisions are built in batches and the houses around an ambiguous roof usually tell you what's on it.

Reading age. You're not finding an install date; you're finding evidence of change over time and current weathering. Two methods stack:

Multi-year comparison. Pull imagery from several years apart for the same roof. A surface that suddenly went from weathered-gray to fresh-black between two image dates is a re-roof, and now you have a tight upper bound on its real age. A surface that looks the same across many years is an old roof that hasn't been touched — and the consistency itself is your evidence.
Current-state weathering. On the most recent image, granule loss reads as lighter, mottled, blotchy patches; streaking and biological growth read as dark vertical stains; and a uniform deep-black surface usually reads as recent. None of these gives you a number, but they push your range younger or older.

Combine the two with the neighborhood build year as the ceiling, and you get a defensible range. The discipline that separates a good estimate from a bad one is simple: always carry the range, never collapse it to a point. "18 to 23 years, no re-roof visible across the imagery" is a sentence you can stand behind at a door. "21 years old" is a sentence that will eventually embarrass you.
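
The range logic above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production method: the function name `estimate_age_range`, its parameters, and the example years are all invented for this sketch, and it ignores the weathering cues that would narrow the range further.

```python
from typing import Optional, Tuple


def estimate_age_range(
    current_year: int,
    build_year: int,
    earliest_image_year: int,
    reroof_between: Optional[Tuple[int, int]] = None,
) -> Tuple[int, int]:
    """Return (youngest, oldest) plausible roof age in years.

    reroof_between is (last_year_seen_weathered, first_year_seen_fresh) when
    the imagery shows a surface change, or None when no change is visible.
    """
    ceiling = current_year - build_year  # roof cannot be older than the house

    if reroof_between is not None:
        weathered_year, fresh_year = reroof_between
        # The re-roof happened somewhere between the two image dates.
        youngest = current_year - fresh_year
        oldest = min(ceiling, current_year - weathered_year)
    else:
        # Untouched across all imagery: at least as old as the first image.
        youngest = min(ceiling, current_year - earliest_image_year)
        oldest = ceiling
    return youngest, oldest


# Hypothetical inputs, loosely modeled on a 1999 subdivision.
print(estimate_age_range(2024, 1999, earliest_image_year=2012))           # (12, 25)
print(estimate_age_range(2024, 1999, 2012, reroof_between=(2019, 2021)))  # (3, 5)
```

The function returns a pair rather than a single number on purpose, so nothing downstream can quietly collapse the range to a point.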

Building a scoring model you can actually run

Now the practical part. You don't need a data-science team to turn this into a ranked list. You need a spreadsheet and a disciplined way to combine the predictors. We'll build the two-path logic from the top of this piece into a simple additive score, then refine it.

Step 1 — set your wear-out baseline by material

Start with a rated service life per material for your market. These are working baselines, not gospel — adjust for your climate. In a high-UV southern market, pull the asphalt numbers down a few years; in a mild northern one, nudge them up.

Material	Working service-life range	Notes
3-tab asphalt	15 to 20 years	The most common wear-out target
Architectural / dimensional asphalt	22 to 30 years	Reads thicker/shadowed from above
Wood shake	20 to 30 years	High maintenance, climate-sensitive
Metal (standing seam)	40 to 60 years	Rarely a wear-out prospect
Tile / slate	50+ years	Underlayment fails before tile does
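
In a script rather than a spreadsheet, the baseline table might look like the sketch below. The dictionary keys, the function name `adjusted_life`, and the size of the climate shift are assumptions for illustration; the year ranges are the ones in the table, with tile and slate left open-ended because the table gives only "50+".

```python
from typing import Dict, Optional, Tuple

# Working service-life ranges in years, copied from the table above.
# Tile / slate has no stated upper bound, so it is left as None.
SERVICE_LIFE_YEARS: Dict[str, Tuple[int, Optional[int]]] = {
    "3-tab asphalt": (15, 20),
    "architectural asphalt": (22, 30),
    "wood shake": (20, 30),
    "metal standing seam": (40, 60),
    "tile / slate": (50, None),
}


def adjusted_life(material: str, asphalt_shift_years: int = 0):
    """Shift the asphalt ranges for climate; other materials are untouched.

    Use a negative shift for a high-UV market and a positive one for a mild
    market. The size of the shift is a judgment call, not a published figure.
    """
    low, high = SERVICE_LIFE_YEARS[material]
    if "asphalt" in material:
        low += asphalt_shift_years
        high += asphalt_shift_years
    return low, high


print(adjusted_life("3-tab asphalt", asphalt_shift_years=-3))        # (12, 17)
print(adjusted_life("metal standing seam", asphalt_shift_years=-3))  # (40, 60)
```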

Step 2 — score the wear-out path (0 to 50)

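For each address, take the midpoint of your estimated age range and divide it by the material's rated life from Step 1 (the worked examples below use the upper end of the service-life range, such as 30 years for architectural asphalt). Score how far through its life the roof is: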

Under 50% of rated life: 0 to 10 points. Plenty of life left. Mostly skip.
50 to 75% of rated life: 15 to 25 points. Worth watching, not yet prime.
75 to 90% of rated life: 30 to 40 points. Prime targets. The meat of your list.
Over 90% of rated life, or past it: 45 to 50 points. Overdue. Knock first.

Then apply modifiers (a few points each) and cap the result at 50. Add points for a high-UV market, south or west exposure, low slope, complex geometry, or no roof permit on an old house. Subtract points for shade (which offsets UV aging only, not moisture or debris), a recent re-roof permit, simple high-pitch geometry, or a mild climate.
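
One way to turn the wear-out bands into a function is sketched below. The point ranges and cutoffs come from the list above; the linear interpolation inside each band, the name `wear_out_score`, and the single `modifier_points` total are assumptions of this sketch. The text leaves the exact number within a band to your judgment, so the worked examples later will not match this interpolation to the point.

```python
from typing import Tuple


def wear_out_score(
    age_range: Tuple[float, float],
    rated_life_years: float,
    modifier_points: float = 0.0,
) -> float:
    """Score the wear-out path on a 0-to-50 scale."""
    midpoint = (age_range[0] + age_range[1]) / 2
    fraction = midpoint / rated_life_years  # share of rated life used up

    # (fraction_from, fraction_to, points_at_from, points_at_to)
    bands = [
        (0.00, 0.50, 0, 10),
        (0.50, 0.75, 15, 25),
        (0.75, 0.90, 30, 40),
        (0.90, 1.00, 45, 50),
    ]
    base = 50.0  # at or past rated life
    for start, end, points_start, points_end in bands:
        if fraction < end:
            # Assumed: interpolate linearly inside the band.
            share = (fraction - start) / (end - start)
            base = points_start + (points_end - points_start) * share
            break

    # Modifiers nudge the score; the path is capped at 50 and floored at 0.
    return max(0.0, min(50.0, base + modifier_points))


print(wear_out_score((22, 25), rated_life_years=20, modifier_points=7))  # 50.0
print(wear_out_score((15, 15), rated_life_years=30))                     # 15.0
```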

Step 3 — score the storm path (0 to 50)

This is where most homegrown models stall, because per-roof exposure is genuinely hard to compute by hand. If all you have is a county hail map, the most honest thing you can do is score it coarsely and label it low-confidence:

No damaging event on record near the roof: 0 points.
Marginal event in the broad area (small hail, moderate wind): 10 to 20 points, low confidence.
Significant event whose path plausibly crossed this roof: 25 to 40 points.
Strong, recent event modeled to this specific roof's orientation and slope: 45 to 50 points, high confidence.

The honest truth: a hand-built model can do Steps 1 and 2 well and Step 3 only crudely. Per-roof storm modeling is the part that's worth getting from a tool built for it, and we'll get to that.
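
A coarse version of the storm path can be kept as a simple lookup, as in this sketch. The tier names, the `position` argument for picking a spot inside each point range, and the "medium" confidence on the third tier are assumptions; the text states a confidence level only for the second and fourth tiers.

```python
from typing import Dict, Tuple

# tier name -> ((min_points, max_points), confidence label)
STORM_TIERS: Dict[str, Tuple[Tuple[int, int], str]] = {
    "none": ((0, 0), "n/a"),
    "marginal_area": ((10, 20), "low"),
    "significant_path": ((25, 40), "medium"),  # confidence assumed, not in the text
    "modeled_to_roof": ((45, 50), "high"),
}


def storm_score(tier: str, position: float = 0.5) -> Tuple[float, str]:
    """Return (points, confidence) for a storm-evidence tier.

    position picks a point inside the tier's range: 0.0 is the bottom,
    1.0 is the top. It stands in for your own judgment of severity.
    """
    (low, high), confidence = STORM_TIERS[tier]
    position = min(1.0, max(0.0, position))
    return low + (high - low) * position, confidence


print(storm_score("marginal_area"))         # (15.0, 'low')
print(storm_score("modeled_to_roof", 0.0))  # (45.0, 'high')
```

Returning the confidence label alongside the points keeps a county-map guess from being mistaken for a per-roof result later in the pipeline.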

Step 4 — combine, but don't just add

Don't simply sum the two paths, because a roof can qualify on either one. A 28-year-old 3-tab roof with no storm history is a great prospect (high wear-out, zero storm). A 9-year-old roof that just took a direct, severe hail event is also a great prospect (low wear-out, high storm). Naive addition would rank both as "medium," which is exactly wrong.

Use the higher of the two paths as the floor, then add a fraction of the other:

Score = max(WearOut, Storm) + 0.4 × min(WearOut, Storm)

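That way a strong signal on either path floats the address up, and a roof that lights up on both (old and storm-hit) tops the list, which is correct, because that's the highest-odds roof there is. One scaling note: with each path capped at 50, the raw score tops out at 50 + 0.4 × 50 = 70, so divide it by 0.7 to put it on the 0-to-100 scale the bands in Step 5 use.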
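
The formula is short enough to check directly. In this sketch `combined_score` is the formula as written; `to_band_scale` is an added assumption that rescales the result, because two paths capped at 50 can reach at most 50 + 0.4 × 50 = 70, while the bands in Step 5 run to 100.

```python
def combined_score(wear_out: float, storm: float, secondary_weight: float = 0.4) -> float:
    """Stronger path in full, plus a fraction of the weaker path."""
    return max(wear_out, storm) + secondary_weight * min(wear_out, storm)


def to_band_scale(raw: float, secondary_weight: float = 0.4, path_cap: float = 50.0) -> float:
    """Rescale a raw combined score onto 0 to 100 (an assumption of this sketch)."""
    raw_max = path_cap * (1 + secondary_weight)  # 70 with the defaults
    return 100 * raw / raw_max


old_calm_roof = combined_score(wear_out=50, storm=0)    # old 3-tab, no storms
young_hit_roof = combined_score(wear_out=5, storm=50)   # young roof, direct hit
old_and_hit = combined_score(wear_out=50, storm=50)     # both paths maxed

for raw in (old_calm_roof, young_hit_roof, old_and_hit):
    print(round(raw, 1), round(to_band_scale(raw)))
```

Either single-path roof lands well above where plain averaging would put it, and the roof that maxes both paths is the only one that reaches the top of the scale.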

Step 5 — sort, band, and route

Rank every address by combined score and cut into bands:

80 to 100 — Knock and mail first. Highest odds. Your A-list.
55 to 79 — Second wave. Solid, work after the A-list.
30 to 54 — Watch / nurture. Worth a mailer, not your crew's feet.
Under 30 — Skip for now. New roofs and untouched roofs. Don't waste payroll here.

That banding is the entire point of the exercise: it turns 4,000 undifferentiated doors into a route that finds real jobs per hour at a rate random knocking can't touch.
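
Sorting and banding is a few lines of code. The cut lines below are the ones listed above and assume a score already on the 0-to-100 scale; the addresses and scores are hypothetical placeholders.

```python
# (minimum score, band label), highest band first
BANDS = [
    (80, "Knock and mail first"),
    (55, "Second wave"),
    (30, "Watch / nurture"),
]


def band_for(score: float) -> str:
    """Map a 0-to-100 score to its routing band."""
    for floor, label in BANDS:
        if score >= floor:
            return label
    return "Skip for now"


# Hypothetical scored addresses.
scored = {"addr A": 84, "addr B": 22, "addr C": 61, "addr D": 47}

# Highest score first, so the route starts with the best odds.
route = sorted(scored.items(), key=lambda item: item[1], reverse=True)
for address, score in route:
    print(address, score, band_for(score))
```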

A worked example: three houses on the same street

Numbers make this concrete. Same block, same 1999 subdivision, same storm history on the county map. Watch how the per-roof data splits them.

House A — 1414 Maple. Aerial imagery across the last several years shows the same weathered surface throughout: no re-roof visible. Build year 1999, no roof permit on file. Material reads as 3-tab asphalt (flat texture, no shadow lines). Age range: roughly 22 to 25 years. That's well past the 15-to-20-year service life for 3-tab.

Wear-out: past 90% of life → 48. South-facing main slope, high-UV market → +4. No permit on a 25-year-old roof → +3. Wear-out path ≈ 50 (capped).
Storm: county shows one moderate hail event two years ago; no per-roof modeling available → 15, low confidence.
Combined: 50 + 0.4 × 15 = 56 raw, which is 80 on the 0-to-100 band scale (56 ÷ 0.7, since the raw maximum is 70). A-list on age alone.

House B — 1418 Maple. Imagery from three years ago shows a fresh, uniform surface that wasn't there before — a clear re-roof. Permit record confirms a 2021 re-roof. Material now reads architectural (shadow lines).

Wear-out: ~3-year-old 30-year roof, under 50% of life → 5.
Storm: same county event → 15, low confidence.
Combined: 15 + 0.4 × 5 = 17 raw, about 24 on the 0-to-100 band scale (17 ÷ 0.7). Skip. Knocking this door wastes your rep's time and tells the homeowner you didn't do your homework.

House C — 1422 Maple. Imagery shows an aging but not ancient surface; build year 1999 but a 2014 re-roof permit on file. Material architectural. Age range ~10 to 12 years — middle-aged. But this roof sits on the windward edge of the block, and a severe hail core actually tracked across this row last spring.

Wear-out: ~11-year-old 30-year roof, ~37% of life → 8.
Storm: a strong, recent event modeled to this roof's exposed orientation → 45, high confidence.
Combined: 45 + 0.4 × 8 = 48.2 raw, about 69 on the 0-to-100 band scale (48.2 ÷ 0.7). Second-wave on score, but if your storm confidence is high this is a knock-now door, the kind a pure age list would have skipped entirely.

Three houses, one street, one county hail map, and three completely different answers. That's the value of moving from area data to roof data. House B is the one a build-year list mails by mistake. House C is the one a hail-map-only list misses. Only the combined, per-roof view gets all three right.
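
The three houses can be re-run as a quick arithmetic check. The path scores are taken straight from the example; dividing by the raw maximum of 70 to reach a 0-to-100 scale is an assumption of this sketch, made so the results can be compared against the bands.

```python
SECONDARY_WEIGHT = 0.4
PATH_CAP = 50
RAW_MAX = PATH_CAP * (1 + SECONDARY_WEIGHT)  # 70 with these settings

# address -> (wear-out path, storm path), as scored in the example
houses = {
    "1414 Maple": (50, 15),
    "1418 Maple": (5, 15),
    "1422 Maple": (8, 45),
}


def combine(wear_out: float, storm: float) -> float:
    """Stronger path in full, plus 0.4 of the weaker path."""
    return max(wear_out, storm) + SECONDARY_WEIGHT * min(wear_out, storm)


results = {}
for address, (wear_out, storm) in houses.items():
    raw = combine(wear_out, storm)
    # Store (raw score, score rescaled to 0-100).
    results[address] = (round(raw, 1), round(100 * raw / RAW_MAX))
    print(address, results[address])
```

On the rescaled numbers, 1414 Maple sits at 80, 1418 Maple at 24, and 1422 Maple at 69, which lines up with the A-list, skip, and second-wave calls in the example.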

The math behind why ranking beats blanketing

It's worth seeing in numbers why this ranking discipline pays, because "work the right doors" sounds like a slogan until you cost it out.

Say your crew can physically knock 1,000 doors in a working window, and in a typical mixed neighborhood roughly 1 in 12 doors hides a roof that's genuinely due — call it an 8% base rate of real prospects if you knock at random. Knock 1,000 random doors and you stand in front of about 80 real prospects, scattered, with a lot of new-roof and not-yet-ready doors in between killing your reps' momentum and morale.

Now rank those same doors and knock only the top band. If your A-list concentrates real prospects at, say, 1 in 3 instead of 1 in 12 — a concentration that's very achievable when age and per-roof storm data are both feeding the score — then 1,000 knocks puts you in front of around 330 real prospects instead of 80. Same crew, same hours, same gas. Four times the at-bats against roofs that are actually due. Even if your real concentration lands at half that, you've roughly doubled productive contacts. That multiplier is the entire economic case for prediction data, and it's also why a wrong model is expensive: a list that feels ranked but isn't just sends your crew back to the 8% base rate while you pay for the illusion of targeting.

The same logic runs your mail. If a mail drop costs you a fixed amount per piece, the cost per real prospect reached falls in direct proportion to how well your list concentrates them. Cutting a 4,000-piece blanket drop down to the 1,200 highest-odds addresses, at the same response rate among real prospects, lands nearly the same number of jobs for well under half the postage. The savings aren't from mailing less for its own sake; they're from not paying to reach roofs that were never going to need you this year.
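
The arithmetic in this section is easy to re-run with your own numbers. The hit rates and piece counts below are the illustrative ones from the text, not measured results, and the function names are invented for this sketch.

```python
def expected_prospects(knocks: int, hit_rate: float) -> float:
    """Real prospects you expect to stand in front of."""
    return knocks * hit_rate


random_hits = expected_prospects(1000, 1 / 12)  # knocking at random
ranked_hits = expected_prospects(1000, 1 / 3)   # knocking the top band only
lift = ranked_hits / random_hits


def postage_share(targeted_pieces: int, blanket_pieces: int) -> float:
    """Fraction of the blanket-drop postage a targeted drop costs."""
    return targeted_pieces / blanket_pieces


print(round(random_hits), round(ranked_hits), round(lift, 1))  # 83 333 4.0
print(postage_share(1200, 4000))                               # 0.3
```

Swap in the hit rates from your own field log once you have them; the lift is only as real as the concentration your list actually delivers.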

Getting the hard data: where per-roof modeling comes in

You can build Steps 1, 2, 4, and 5 yourself with imagery, permit records, and a spreadsheet. The piece that's genuinely hard to do by hand is Step 3 — turning a county storm event into a per-roof exposure score, and deriving a defensible age range from imagery across thousands of addresses at once. That's exactly the gap RoofPredict was built to fill.

The approach is the two-path logic from the top of this piece, run at scale. For an area you choose, it estimates roof age as a range per address from aerial imagery, and it models storm physics (hail and wind) at the individual roof rather than smearing a county map across every house. "We model the storm on each roof, not only where it passed" is the plain version: a hail map shows you where it hailed; this shows which roofs the storm is modeled to have hit hard enough to damage, scored against how old each one already was. The output is the ranked, banded list this whole method produces, showing house by house which doors are most likely due, so your crew knocks and your mail lands on the roofs that earn it and skips the ones that don't.

Honest limits, because that's the only way to use any tool well:

It gives you odds, not proof. A high score means high likelihood, not a guaranteed job. You still confirm on the roof. The forecast ranks your list; it doesn't replace an inspection or a homeowner's decision.
Roof age is a range, not a date. From the air you can bracket when a roof was last done; you cannot read an install certificate. Anyone selling you exact install dates from imagery is overselling.
It's not a lead-buying service. Nobody hands you a homeowner who raised their hand. It sharpens the outbound you already do — it ranks your streets and your old customer list so the same crew, mail, and payroll find more real jobs.
It doesn't touch the insurance side. If a roof turns out to be storm-damaged, you document conditions and write your estimate; the insurer decides coverage and the homeowner owns their claim. The data points your crew at the right roofs. It says nothing about deductibles or whether a claim gets paid.

Used for what it is — a way to compute the one predictor that's hard to compute by hand, and to do the rest at a scale a spreadsheet can't — it closes the exact gap homegrown models leave open.

Your own customer book is the most predictive data you already own

One data source gets overlooked because it's not flashy: your CRM. The roofs you inspected or estimated three, five, seven years ago and didn't close are now older by exactly that many years — and you already have the address, the homeowner's name, and often a note on what the roof looked like. That's a prediction dataset most shops are sitting on without scoring it.

Run your old estimates and past customers through the same two-path logic. An estimate you wrote in 2019 on a roof you judged to be "near end of life" is almost certainly past end of life now, unless someone else has replaced it since, which a look at current imagery will tell you. A homeowner who declined a repair five years ago because the roof "had a little life left" is a different conversation today. Layer per-roof storm exposure on top of those addresses and the dead ones light back up. This costs you nothing in list acquisition (the money is already in your book), and it converts better than cold data because there's a prior relationship. When you score an area, score your own history alongside it. It's frequently the highest-odds, lowest-cost segment you have.
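
Aging an old estimate forward is simple date arithmetic. In this sketch the function name and the sample record are hypothetical, and it assumes the roof has not been replaced since the estimate, which is worth checking against current imagery first.

```python
from typing import Tuple


def age_range_today(
    age_range_at_estimate: Tuple[int, int],
    estimate_year: int,
    current_year: int,
) -> Tuple[int, int]:
    """Shift an age range recorded at estimate time forward to today.

    Assumes no re-roof since the estimate was written.
    """
    elapsed = current_year - estimate_year
    low, high = age_range_at_estimate
    return low + elapsed, high + elapsed


# Hypothetical CRM record: judged 17 to 20 years old when estimated in 2019.
print(age_range_today((17, 20), estimate_year=2019, current_year=2026))  # (24, 27)
```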

Validating your model before you trust it

A scoring model that nobody checks is just a confident way to be wrong. Build a feedback loop from day one.

Backtest on jobs you already won. Take your last 50 to 100 signed replacements. Run their addresses through your model as if you hadn't knocked them yet. What scores did they get? If most of your real, closed replacements land in the 55-plus bands, your model has signal. If they're scattered evenly across all bands, your weights are off — usually because you're over-trusting build year or a coarse hail map. This single exercise is the most valuable thing you can do, and almost nobody does it.
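
A backtest of this kind reduces to one ratio: what share of won jobs would have scored in the top two bands. The scores below are made-up placeholders, and the function name is an assumption of this sketch.

```python
from typing import Iterable


def share_at_or_above(scores: Iterable[float], threshold: float = 55) -> float:
    """Fraction of scores at or above the threshold (0.0 for an empty list)."""
    scores = list(scores)
    if not scores:
        return 0.0
    return sum(1 for s in scores if s >= threshold) / len(scores)


# Hypothetical model scores for ten replacements you already signed.
won_job_scores = [82, 91, 67, 44, 58, 73, 29, 88, 61, 76]
print(share_at_or_above(won_job_scores))  # 0.8
```

A share well above what you would get by scoring random addresses in the same area suggests signal; a share close to it suggests the weights need work.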

Track hit rate by band in the field. As crews work the list, log outcomes by band: knocks, inspections, signed jobs. Within a few weeks you'll see whether your A-list (80-plus) really converts better than your second wave, and by how much. If it doesn't, the bands are mislabeled and need retuning.
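
If the field log is a list of (band, signed) pairs, hit rate by band is a short tally. The log format and the sample entries are assumptions for illustration; a real log would also track knocks and inspections separately.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def hit_rate_by_band(log: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Signed jobs divided by knocks, per band."""
    knocks, signed = Counter(), Counter()
    for band, won in log:
        knocks[band] += 1
        if won:
            signed[band] += 1
    return {band: signed[band] / knocks[band] for band in knocks}


# Hypothetical outcomes: (band, did the knock end in a signed job?)
field_log = [
    ("A-list", True), ("A-list", False), ("A-list", True), ("A-list", False),
    ("Second wave", False), ("Second wave", True),
    ("Second wave", False), ("Second wave", False),
]
print(hit_rate_by_band(field_log))  # {'A-list': 0.5, 'Second wave': 0.25}
```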

Watch the false-positive patterns. When an A-list door turns out to have a five-year-old roof, find out why your model missed it. Almost always it's an invisible re-roof the imagery or permit data didn't catch. Note the pattern; it tells you where to tighten.

Re-pull, don't set and forget. Roofs age, storms hit, re-roofs happen. A list built 18 months ago has drifted. The roofs that were at 70% of life are now closer to 80%, and any storm since then has changed the storm-path scores. Refresh on a cadence that matches your market, more often in active storm regions.

Edge cases that quietly poison a list

Every scoring model meets reality and reality has exceptions. These are the ones that do the most damage if you don't account for them.

The partial re-roof. A homeowner who only replaced the storm-facing slope after a prior claim leaves you a roof that's half new and half ancient. Imagery of the wrong slope tells you the wrong story either way. Where you can see both slopes, treat mixed surfaces as a flag for field confirmation rather than a confident score.

The overlay. A roof shingled directly over the old one (a second layer) reads as a newer surface from above but is structurally a roof on borrowed time and often near a tear-off anyway. Imagery can't see the layer count. In older housing stock where overlays were common, nudge your confidence down and lean on the curbside read, where a thick, lumpy edge at the eaves gives it away.

The premium roof that outlives its rating. A high-end architectural or metal roof installed by a meticulous contractor with great ventilation can run well past its book number. If your model keeps scoring a 35-year-old roof as overdue and the field keeps finding it sound, that neighborhood's install quality is beating your baseline. Adjust the baseline for that pocket rather than fighting it door by door.

The cheap roof that dies early. The mirror image: a builder-grade roof in a high-UV market, badly ventilated, can be cooked at 12 years. If your field outcomes keep finding dead roofs you scored as mid-life, pull your service-life baseline down for that material in that market.

The new construction wave. A subdivision finishing this year is a guaranteed skip for a decade — but it'll surface on a build-year pull as "no roof data," and a careless model might score the unknown as a maybe. Hard-zero anything with imagery showing a fresh roof, regardless of what the property record says.

The storm that hit and got fully fixed. A neighborhood that took a severe event two years ago and was largely re-roofed under claims is now, paradoxically, one of the worst prospects in your area — full of two-year-old roofs — even though the storm data still flags it hot. This is the single most common reason a storm-only list flops. Your imagery has to override your storm history here: if the roofs are visibly new, the storm exposure is already spent. Score the roof you see, not the storm you remember.
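
Several of these edge cases come down to letting current imagery override the other inputs. A minimal sketch of that rule follows; the class and function names are invented here, and it does not handle a storm that arrives after the new roof went on, which the cases above do not cover.

```python
from dataclasses import dataclass


@dataclass
class PathScores:
    wear_out: float
    storm: float
    needs_field_check: bool = False


def apply_imagery_overrides(
    scores: PathScores,
    roof_visibly_new: bool = False,
    mixed_surfaces: bool = False,
) -> PathScores:
    """Let what the imagery shows override records and storm history."""
    wear_out, storm = scores.wear_out, scores.storm
    if roof_visibly_new:
        # Fresh roof on the image: hard-zero both paths, whatever the
        # property record or the older storm data says.
        wear_out, storm = 0.0, 0.0
    # Half-new, half-old roofs get flagged for a person, not a confident score.
    flag = scores.needs_field_check or mixed_surfaces
    return PathScores(wear_out, storm, flag)


print(apply_imagery_overrides(PathScores(10, 45), roof_visibly_new=True))
print(apply_imagery_overrides(PathScores(30, 20), mixed_surfaces=True))
```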

A repeatable workflow, start to finish

Put it together as a process your team can run every cycle:

Pick the area. A set of neighborhoods, a radius, or your own past-customer and old-estimate list (which is gold — those people already know you).
Pull the raw predictors. Age range from imagery, material read, geometry, permit history where available, and per-roof storm exposure.
Score both paths — wear-out (0 to 50) and storm (0 to 50) — for every address.
Combine with the max-plus-fraction formula so either path can float an address up.
Band and route into Knock-first / Second-wave / Nurture / Skip.
Confirm in the field. Curbside read and, where warranted, a roof inspection on the high-band doors. This is where the curbside signals from Tier 3 finally earn their keep.
Log every outcome by band and feed it back into your weights.
Refresh on a cadence. Re-pull, re-score, re-route.

Notice that the model never makes the final call. It produces odds and a route. A human still gets on the roof, and the homeowner still decides. The model's job is to make sure that human is standing in front of the right house far more often than chance would put them there.

A note on confidence, and saying it out loud

One habit separates teams that trust their data from teams that quietly stop using it: scoring the confidence of each prediction alongside the score itself. A roof you scored 85 off clean multi-year imagery and a high-confidence per-roof storm model is a different animal from a roof you scored 85 off a single blurry image and a county hail blob. Same number, very different bet.

Carry a confidence flag — high, medium, low — next to every score, and route accordingly. High-confidence high-score doors are where you send your best closer first. Low-confidence high-score doors are where you send a rep to do a quick curbside read before you commit a full inspection slot. This costs you one extra column and it stops your team from treating a coin-flip and a near-sure-thing as the same door. It also makes your backtest sharper: when a prediction misses, you can see whether it was a high-confidence miss (a real model problem to fix) or a low-confidence one (expected noise you'd already flagged).
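
The routing rule can be written down so nobody has to remember it. This sketch assumes "high-score" means the 80-plus A-list band and leaves medium-confidence doors without special handling, since the rule above only spells out the high and low cases; the function name and return strings are placeholders.

```python
VALID_CONFIDENCE = {"high", "medium", "low"}


def route(score: float, confidence: str) -> str:
    """Decide who goes to the door, from score plus confidence flag."""
    if confidence not in VALID_CONFIDENCE:
        raise ValueError(f"unknown confidence flag: {confidence!r}")
    if score < 80:
        # Assumed: below the A-list, the normal band routing applies.
        return "follow normal band routing"
    if confidence == "high":
        return "send best closer first"
    if confidence == "low":
        return "quick curbside read before booking an inspection slot"
    return "A-list door, no special handling specified"


print(route(85, "high"))
print(route(85, "low"))
```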

The broader principle, which is the through-line of everything here: prediction data earns trust by being honest about what it doesn't know. A model that brands every door a sure thing gets ignored the first time a rep finds a new roof behind an 85. A model that says "high odds, medium confidence, go look" survives contact with the field, because the field confirms what it claimed and forgives what it hedged.

What pros get wrong

A few failure patterns show up again and again, even among shops that should know better.

They build a list on a single predictor. Usually build year, sometimes a hail map. Either one alone produces a list that's wrong in predictable, expensive ways — mailing re-roofed houses, skipping storm-hit middle-aged roofs. The combination is the whole point.

They treat an age estimate as a date. A range honestly stated is a tool. A false exact date is a liability — it makes you certain at the door about something you guessed, and homeowners can smell that.

They use the county hail map as the storm answer. It's a starting filter, not a per-roof predictor. The roofs that actually got hit live at a finer resolution than any county map shows.

They never backtest. They trust the model because it feels right, never check it against jobs they actually won, and never learn that their weights are off.

They confuse can-pay with needs-it. A wealthy neighborhood with new roofs is a worse list than a modest one full of 22-year-old 3-tab. Value data filters; it doesn't predict need.

They forget the field-confirm step. The model ranks; it doesn't diagnose. Skipping the curbside-and-ladder confirmation on high-band doors turns good odds into bad knocks. The data gets you to the right street; your eyes and the homeowner's roof close the loop.

The honest bottom line

The data that predicts a roof replacement isn't one number. It's a roof old enough to be vulnerable, made of a material that sets the clock, aged faster or slower by sun and slope and shade, and exposed — or not — to a storm modeled at that roof rather than smeared across a county. Age and per-roof storm exposure are the heavy hitters. Material and climate and geometry tune them. Build year and public "roof age" fields are the seductive weak signals that wreck naive lists by missing every re-roof.

Combine the two paths so either one can float an address up, band the results, confirm in the field, and check your model against the jobs you actually win. Do that and you stop knocking at random. You start standing in front of the roofs the data says are most likely due — which is the closest thing prospecting has to a sure bet, as long as you remember it's odds, not certainty, all the way down.

FAQ

What single piece of data best predicts a roof replacement?

Roof age, expressed as a range rather than an exact date, carries the most predictive weight on its own. The odds of replacement climb steeply once a roof crosses the back half of its rated service life. But age has to be read against material type, and it misses every storm-totaled roof, so the strongest prediction comes from pairing age with per-roof storm exposure rather than relying on any one number.

Why isn't the year a house was built a good predictor of roof age?

Because a re-roof is invisible to it. The build year only tells you the roof's maximum possible age, not its actual age. A house built in 1998 may have been completely re-roofed in 2020. Build a list on build year alone and you will mail and knock houses with nearly new roofs, which wastes budget and hurts your credibility at the door. Use it as a ceiling, never as the estimate.

Can a hail map tell me which roofs need replacing?

Only very coarsely. A county or regional hail map shows that a storm happened somewhere near a group of homes; it doesn't show which individual roofs took damaging stones at a damaging angle. Two houses a few streets apart can have completely different exposure from the same storm. To predict damage at the door you need storm exposure modeled at the individual roof, accounting for its slope, orientation, and the storm's actual path.

How accurate can a roof age estimate from aerial imagery be?

Accurate enough to be useful as a range, never as an exact date. By comparing imagery across years and reading neighborhood build patterns, you can bracket when a roof was last redone, for example 16 to 22 years ago. That range is genuinely valuable for ranking a list. What you cannot do from the air is read an install certificate, so treat any tool promising exact install dates from imagery as overselling.

How do I combine roof age and storm data into one score?

Don't just add them, because a roof can qualify on either path independently. Score the wear-out path (age, material, climate) and the storm path (per-roof exposure) separately, each 0 to 50, then use the higher of the two as a floor and add a fraction of the lower: Score = max(WearOut, Storm) + 0.4 × min(WearOut, Storm). That way an old roof with no storm and a young roof with a direct hit both rank high, and a roof that is both old and storm-hit tops the list.
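
To see how the formula behaves next to plain addition, here is a short sketch. The four roofs and their path scores are made-up inputs for illustration, not data from the article.

```python
def max_plus_fraction(wear_out: float, storm: float, fraction: float = 0.4) -> float:
    """Score = max(WearOut, Storm) + fraction * min(WearOut, Storm)."""
    return max(wear_out, storm) + fraction * min(wear_out, storm)


# Hypothetical (wear_out, storm) path scores, each on the 0-to-50 scale.
roofs = {
    "old roof, no storm": (45, 0),
    "young roof, direct hit": (5, 48),
    "old and storm-hit": (45, 48),
    "middling on both": (25, 25),
}

if __name__ == "__main__":
    for label, (wear_out, storm) in roofs.items():
        combined = max_plus_fraction(wear_out, storm)
        print(f"{label:24s} combined={combined:5.1f}  plain sum={wear_out + storm}")
```

On these inputs, a plain sum puts the roof that is middling on both paths (50) ahead of the old roof with no storm (45). The max-plus-fraction score reverses that order (35 against 45), while the roof that is both old and storm-hit still lands on top.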

Does a roof replacement prediction tool guarantee I'll find a job at that address?

No, and any tool that claims otherwise is overselling. Prediction data gives you odds, not proof. A high score means a high likelihood that the roof is due, which is why you knock it first, but you still confirm condition on the roof and the homeowner still decides. The value is statistical: a well-ranked list finds far more real jobs per hour than random knocking, even though no single door is a sure thing.

Is this the same as buying roofing leads?

No. Lead services sell you a homeowner who raised their hand, often resold to several competitors. Replacement-prediction data does something different: it ranks the streets and old-customer lists you already work so your existing crew, mail, and payroll land on the roofs most likely to be due. You own the outreach and the relationship; the data just points you at the right doors instead of every door.

How do I check whether my scoring model actually works?

Backtest it. Take your last 50 to 100 signed replacements and run their addresses through the model as if you hadn't knocked them yet. If most of your real closed jobs land in the high bands, the model has signal. If they're scattered evenly across every band, your weights are off, usually from over-trusting build year or a coarse hail map. Then track real-world hit rate by band in the field and retune.
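
The backtest itself is mostly counting. Below is a minimal sketch, assuming you have already run each signed job's address through the model and recorded its band. Treating Knock-first and Second-wave as the "high bands" and reading "most" as more than half are assumptions made here, not definitions from the article.

```python
from collections import Counter

BANDS = ["Knock-first", "Second-wave", "Nurture", "Skip"]


def band_shares(closed_job_bands: list[str]) -> dict[str, float]:
    """Share of closed jobs that the model would have placed in each band."""
    if not closed_job_bands:
        raise ValueError("need at least one closed job to backtest")
    counts = Counter(closed_job_bands)
    total = len(closed_job_bands)
    return {band: counts[band] / total for band in BANDS}


def has_signal(shares: dict[str, float],
               high_bands: tuple[str, ...] = ("Knock-first", "Second-wave"),
               threshold: float = 0.5) -> bool:
    """True when more than `threshold` of closed jobs fall in the high bands."""
    return sum(shares[band] for band in high_bands) > threshold


if __name__ == "__main__":
    # Hypothetical backtest of 50 signed replacements.
    bands = (["Knock-first"] * 30 + ["Second-wave"] * 10
             + ["Nurture"] * 6 + ["Skip"] * 4)
    shares = band_shares(bands)
    print(shares, has_signal(shares))
```

An even spread across the four bands fails this check, which matches the "weights are off" case described above.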

How does material type change a roof's replacement odds?

Material sets the clock that age runs against. A 20-year-old 3-tab asphalt roof rated for 20 years is at the end of its life; a 20-year-old architectural asphalt roof rated for 30 years has roughly a decade left; standing-seam metal or tile at 20 years is barely middle-aged. You can usually bucket material from aerial imagery by texture and panel or tile shadows, and that bucket changes the entire age calculation.
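
The same arithmetic can be sketched in a few lines. The 20-year and 30-year ratings come from the answer above; the 50-year figure for standing-seam metal is an assumption for illustration only, so replace it with the rating you actually use.

```python
RATED_LIFE_YEARS = {
    "3-tab asphalt": 20,          # from the text
    "architectural asphalt": 30,  # from the text
    "standing-seam metal": 50,    # assumption for illustration
}


def life_used_range(material: str, age_low: float, age_high: float) -> tuple[float, float]:
    """Fraction of rated life consumed, for an age given as a range."""
    rated = RATED_LIFE_YEARS[material]
    return age_low / rated, age_high / rated


def years_left(material: str, age_years: float) -> float:
    """Rated years remaining at a point-estimate age, floored at zero."""
    return max(0, RATED_LIFE_YEARS[material] - age_years)


if __name__ == "__main__":
    for material in RATED_LIFE_YEARS:
        low, high = life_used_range(material, 16, 22)
        print(f"{material}: {low:.0%} to {high:.0%} of rated life at 16 to 22 years")
```

Passing the age as a range keeps the output honest: a 16-to-22-year bracket yields a bracket of life used, not a single figure.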

How often should I refresh a roof-replacement prospect list?

On a cadence that matches your market, because the data drifts. Roofs age, new storms rewrite the storm path entirely, and re-roofs happen. On a list built 18 months ago, some roofs have moved from roughly 70 to 80 percent of rated life, and the storm exposure no longer reflects reality. Refresh more often in active storm regions and at least seasonally elsewhere, re-pulling imagery and storm data and re-scoring.
