Data Updates Coming to UD

Noah Rosenblatt

Talking Manhattan on
Staff member
As some of you may know, UD charts have been live for over 12 years, after first launching in 2010 to fight the transparency problem with NYC real estate data. It was all about helping buyers, sellers, and agents see the market in a more efficient way. Little did we know what a black hole of data integrity and aggregation issues we were entering.

We made it through, with proprietary models and filters to cleanse the data and rid it of the stale, illogical, and duplicate listings, and all the other data integrity monsters that lurk in the dark corners of local NYC real estate. If you know, you know.
One rule in particular was put in place to count ACTIVE Supply.

The rule was simple and elegant: REBNY, for years, had a rule in place that required agents to UPDATE their exclusive listings every 14 days. This was to ensure data accuracy and timeliness, two critical components of data integrity. This rule was right, it was pure, and most agents followed it or were locked out of their listing systems until updates were made. After all, it takes only a few minutes to update even a portfolio of active listings. UD removes listings not updated in 30 days, more than 2x the old REBNY window.
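For anyone who wants to see the idea in code, here is a minimal sketch of a 30-day staleness filter like the one described above. This is not UD's actual implementation: the `last_updated` field, the function name, and the choice to treat a listing updated exactly 30 days ago as still current are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Cutoff stated in the post: listings not updated in 30 days are dropped.
STALE_AFTER = timedelta(days=30)


def split_supply(listings, as_of):
    """Split active listings into (actively_updated, stale).

    `listings` is a list of dicts with a hypothetical `last_updated`
    date field. A listing counts as stale when more than 30 days have
    passed since its last update (boundary choice is an assumption).
    """
    updated, stale = [], []
    for listing in listings:
        age = as_of - listing["last_updated"]
        if age > STALE_AFTER:
            stale.append(listing)
        else:
            updated.append(listing)
    return updated, stale
```

Under this sketch, Actively Updated Supply is the length of the first list and ALL Supply is the two lists combined, so the gap between the two counts is simply the number of stale listings.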

Around January, this rule disappeared as REBNY switched over and took full control of processing the industry's shared listings from vendors. It was most likely just a victim of a very large project with lots of moving pieces. But all I care about is data quality.

So, over the last 6 months, more and more active listings were NOT updated by agents as this rule faded. As time went on, it was clear we needed to act.

In the coming weeks we will be removing the rule that is behind this message on ALL Active Supply charts:
The difference between Actively Updated Supply and ALL Supply is now about 1,150 units; typically the difference is under 500. A clear change since this longtime REBNY rule faded.

So, we did the following things in anticipation, mainly doubling down on data integrity:
1. We hired a private data team to manually review and merge listings that had gone unmatched to closed sales due to strange data manipulation
2. We applied a few structural fixes to ACRIS closings so that previously unused property types (e.g., commercial townhouses) match residential sales
3. We optimized our status flow algos with all the information we learned from the above tasks

We did this work over the past 7 months, since we first detected the issue.

In the coming weeks we will launch all of these fixes at once - and regenerate Active Supply, Pending Sales & Market Pulse Charts.

The result will likely be:
1. Active Supply - HIGHER PRESSURE (now that we will ultimately count stale, unupdated listings)
2. Pending Sales - LOWER PRESSURE (optimized merge with acris closings due to data irregularities)
3. Market Pulse - SHIFT LOWER (with more supply and fewer pending sales, pulse will shift to a lower level)
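To see why the three results move together, here is a rough sketch. The post does not spell out the Market Pulse formula, so this assumes it is simply pending sales divided by active supply; the function name and every number in the usage note are hypothetical, not UD data.

```python
def market_pulse(pending, supply):
    """Assumed definition: ratio of pending sales to active supply.

    This formula is an assumption for illustration only; the actual
    Market Pulse calculation is not described in this post.
    """
    if supply <= 0:
        raise ValueError("supply must be positive")
    return pending / supply
```

With made-up inputs, `market_pulse(3000, 6000)` is higher than `market_pulse(2800, 7150)`: raising the denominator and lowering the numerator at the same time can only push the ratio down, which is the direction described in item 3.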

The TRENDS will all be exactly the same as they are now; it's just the absolute #s that will change. Because the data differences existed over time before this change, we are removing them throughout the historical data by regenerating all datapoints.

While we never like to make data methodology updates, as we are doing with Active Supply, when there are industry changes we must adapt. This is one of those times. All changes will be announced and spelled out as we roll out these updates. We will continue to invest in data quality and the interpretation of this data for market intelligence.

Noah, John and the UD Data Team

David Goldsmith

All Powerful Moderator
Staff member
REBNY's recent claims regarding "data integrity" are suspect. There has been virtually no enforcement against:
- Pocket listings
- Phoney unit numbers
- Inaccurate room/bedroom counts (using those in "alternate floor plans" rather than actual counts)
- Rejiggering Days on Market

Noah Rosenblatt

Talking Manhattan on
Staff member
Yeah, it's a tough task REBNY has been given. I can attest, they have many mouths to feed and they are doing what they can to keep data accurate. The intentions are there.

It's just a gargantuan task when the data is only as good as the agent who enters it. You need to enforce the rules on the agent entering that data.

They have a portal to launch and other tech issues, I'm sure, so I can sympathize with bottlenecks on development. I just hope over time these issues get addressed and get priority.

Noah Rosenblatt

Talking Manhattan on
Staff member
I just want to bump this up and say that the regen of Marketwide chart data points is done. Now we have to regen all resale and new dev data points. That will take a week or so under the hood.