Invalid album IDs in parquet files

I’m spot-checking some of the data I’m extracting from the Apple Music Feed parquet files and finding numerous invalid album IDs.

For example, just looking at any album with a primary artist id of 163043, I see a few albums that are not available at music.apple.com/us/album/NNN. These include:

981094158 - Time and the River
1803443737 - Celebration (Live New York ’80)
1525426873 - Anything You Want: The Warner-Reprise-Elektra Years

I notice that none of these album IDs is returned by the general Music API, so I’m a bit confused about why they would exist in the parquet files at all.

Thanks.

Two other examples worth noting, as I'm trying to determine the best possible path forward.

#1 - For Grover Washington Jr's GROVER LIVE album, the parquet files point to two album IDs - 370892329 and 721265199. The first is invalid, while the second is valid. So with that in mind, I thought perhaps whichever ID has a later last_modified_date may be the more appropriate choice.

#2 - But then I looked at Branford Marsalis / Joey Calderazzo's SONGS OF MIRTH AND MELANCHOLY album. Here, the parquet files provide four IDs (435851031, 1125447166, 1442978087, and 1443138123). Unfortunately, in this case the last_modified_date values offer no clue, and it turns out the first and last IDs work while the other two do not.

Is there any way to determine which links are valid from the data provided within the parquet files?
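
In the meantime, the only check I have is outside the parquet data: requesting music.apple.com/us/album/NNN for each candidate ID and seeing whether the page loads. Here is a rough sketch of that workaround in Python. It assumes that an HTTP 200 response means the album is available and that anything else means it is not, which I have not confirmed as official behavior. The names http_status and split_by_availability are my own and not part of any Apple API.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List

# URL pattern I have been checking by hand (US storefront).
ALBUM_URL = "https://music.apple.com/us/album/{album_id}"


def http_status(url: str) -> int:
    """Return the HTTP status code for a GET request to url.

    Network failures other than HTTP errors (DNS, timeouts) are not
    handled here and will raise.
    """
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def split_by_availability(
    album_ids: Iterable[int],
    fetch_status: Callable[[str], int] = http_status,
) -> Dict[str, List[int]]:
    """Split album IDs into 'valid' and 'invalid' lists.

    Assumption: status 200 means the album page exists; any other
    status is treated as invalid.
    """
    result: Dict[str, List[int]] = {"valid": [], "invalid": []}
    for album_id in album_ids:
        status = fetch_status(ALBUM_URL.format(album_id=album_id))
        key = "valid" if status == 200 else "invalid"
        result[key].append(album_id)
    return result
```

Calling split_by_availability([435851031, 1125447166, 1442978087, 1443138123]) would make one request per ID, which is why I would much rather have a field in the parquet files that answers this directly.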
