DESI DR1 Spectral Anomaly Catalog
Anomaly Explorer
195,829 previously unidentified spectral anomalies from 18M DESI DR1 spectra. This is the first autoencoder search at full DR1 scale (~90x larger than the prior EDR work by Liang+ 2023 and Nicolaou+ 2026). None of the top 100 objects appear in SIMBAD, so they are uncharacterized in that database. Browse the top 1,000 by anomaly score below.
This catalog is from the initial anomaly detection pass. Current status:
- Steps 1–3 (detection, SIMBAD cross-match, classification by band pattern): complete.
- Step 4 (bias validation): complete.
- Step 5 (fNL improvement): complete.
- Artifact verification: complete. 200/200 top anomalies were verified as genuine by spectral inspection (0% sky artifacts). Three wavelength clusters were found: 12 at 7600Å, 28 at 3600–3700Å, and 3 at 9440–9480Å.
- Enhanced 18M catalog: 44% complete (7.9M/17.9M). Its redshifts are expected to resolve how the clusters should be interpreted.
- Step 6 (paper): draft v0.1 exists.
Click any row to view the Legacy Survey image, read AI analysis notes, and add your own review comments.
Galaxies are 20× more likely to be spectrally anomalous than QSOs (aggregate from 6.5M spectra). The anomalies are not missed quasars: they are unusual galaxies at z∼0.3–0.5 whose spectra do not match any known template. Anomaly score shows no correlation with S/N, which indicates that the anomalies are not noise artifacts. Artifact verification: 200/200 top anomalies are genuine astrophysical sources (0% sky artifacts), verified by downloading the actual DESI spectra and comparing each peak wavelength against known sky and telluric lines. The enhanced 18M catalog (45 columns, latent vectors, redshifts) is running on an H200 and is 44% complete.
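The peak-wavelength check described above can be illustrated with a minimal sketch. This is not the pipeline's actual code: the feature list, its approximate wavelengths, the `classify_peak` name, and the 3Å tolerance are all assumptions chosen for illustration.

```python
# Hypothetical reference list of common sky-emission and telluric features.
# Wavelengths are approximate, in Angstroms: (name, range start, range end).
SKY_FEATURES = [
    ("[O I] 5577", 5577.3, 5577.3),
    ("Na I D", 5890.0, 5896.0),
    ("[O I] 6300", 6300.3, 6300.3),
    ("O2 B-band", 6867.0, 6884.0),
    ("O2 A-band", 7594.0, 7621.0),
]


def classify_peak(peak_wavelength, tolerance=3.0):
    """Return the name of the sky/telluric feature that a residual peak
    overlaps, or None if it overlaps none of the listed features.

    The tolerance (in Angstroms) is an assumed value, not one from the catalog.
    """
    for name, start, end in SKY_FEATURES:
        if start - tolerance <= peak_wavelength <= end + tolerance:
            return name
    return None


print(classify_peak(5577.0))  # overlaps the [O I] 5577 sky line
print(classify_peak(6100.0))  # overlaps nothing in this short list
```

A peak that returns `None` here is only cleared against this short list; a real check would use a fuller sky-line atlas and account for spectral resolution.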
Sky Distribution
All 1,000 top-scored anomalies plotted by RA/Dec. Color indicates anomaly score (yellow = highest, blue = threshold).
Top Anomalies
Showing top 1,000 by anomaly score. Click column headers to sort. Each row links to the Legacy Survey image viewer. Full catalog (195,829 objects) available for download.
| # | Score | RA | Dec | Band | rB | rR | rZ | Image | Full |
|---|---|---|---|---|---|---|---|---|---|
How Anomaly Detection Works
What is an anomaly? A spectral autoencoder is a neural network trained to compress and reconstruct normal DESI spectra (stars, galaxies, quasars). When it encounters a spectrum that doesn’t match any learned pattern, the reconstruction is poor — producing a high residual. Objects with total residual (anomaly score) above 5.0 are flagged. These are spectra the model literally “doesn’t know what to do with.”
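As a rough sketch of the flagging rule described above: the text gives the threshold (5.0) but not the residual metric, so the mean squared difference used here is an assumption, as are the function names and the toy spectra.

```python
THRESHOLD = 5.0  # from the text: scores above 5.0 are flagged


def arm_residual(flux, reconstruction):
    """Mean squared difference between observed and reconstructed flux.

    Assumed metric: the catalog does not state how the residual is computed.
    """
    if len(flux) != len(reconstruction):
        raise ValueError("flux and reconstruction must have the same length")
    return sum((f - r) ** 2 for f, r in zip(flux, reconstruction)) / len(flux)


def anomaly_score(arms):
    """Sum the per-arm residuals. `arms` maps arm name -> (flux, reconstruction)."""
    return sum(arm_residual(flux, recon) for flux, recon in arms.values())


def is_anomalous(score, threshold=THRESHOLD):
    """A spectrum is flagged when its total residual exceeds the threshold."""
    return score > threshold


# Toy two-pixel spectra: only the R arm is reconstructed badly.
toy = {
    "B": ([1.0, 2.0], [1.0, 2.0]),
    "R": ([1.0, 5.0], [1.0, 1.0]),
    "Z": ([0.0, 0.0], [0.0, 0.0]),
}
score = anomaly_score(toy)
print(score, is_anomalous(score))  # 8.0 True
```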
What the score means: The anomaly score is the sum of reconstruction errors across DESI’s three spectrograph arms. Higher = more unusual. The score tiers are:
Column Definitions & Glossary
Table Columns
- Score
- Total reconstruction error across all three spectrograph arms (B + R + Z). Higher = more anomalous.
- RA
- Right Ascension (degrees, 0–360). East-west position on the sky in the ICRS coordinate system.
- Dec
- Declination (degrees, -90 to +90). North-south position on the sky.
- Band
- Which spectrograph arm has the largest residual: B (blue, 3600–5800Å), R (red, 5760–7620Å), or Z (near-IR, 7520–9824Å).
- rB
- Reconstruction error in the B (blue) arm. High rB = anomalous blue-end features (e.g. unusual emission lines, UV excess).
- rR
- Reconstruction error in the R (red) arm. High rR = anomalous mid-optical features (e.g. unusual continuum, absorption).
- rZ
- Reconstruction error in the Z (near-infrared) arm. High rZ = anomalous near-IR features (e.g. high-redshift emission shifted into IR).
- TID
- DESI TARGETID — unique identifier for this object in the DESI DR1 catalog.
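The column definitions above imply how Score and Band follow from the three per-arm residuals. A minimal sketch, using the arm wavelength ranges quoted in the Band entry; the function names and the tie-breaking behavior are illustrative assumptions.

```python
# Arm wavelength ranges in Angstroms, as quoted in the Band definition.
ARM_RANGES = {"B": (3600, 5800), "R": (5760, 7620), "Z": (7520, 9824)}


def table_row(rB, rR, rZ):
    """Derive the Score and Band columns from the per-arm residuals."""
    residuals = {"B": rB, "R": rR, "Z": rZ}
    # Band is the arm with the largest residual. On a tie, max() returns the
    # first arm in B, R, Z order (an assumption; the source does not say).
    band = max(residuals, key=residuals.get)
    return {"Score": rB + rR + rZ, "Band": band, "rB": rB, "rR": rR, "rZ": rZ}


def arms_covering(wavelength):
    """List the arms whose range includes a wavelength (the ranges overlap)."""
    return [arm for arm, (lo, hi) in ARM_RANGES.items() if lo <= wavelength <= hi]


print(table_row(1.0, 2.5, 4.0))  # Score 7.5, Band "Z"
print(arms_covering(5780))       # ['B', 'R']: inside the B/R overlap
```

Because adjacent arms overlap (5760–5800Å and 7520–7620Å), a feature in an overlap region can contribute to the residual of two arms at once.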
Astronomy Terms
- AGN
- Active Galactic Nucleus — a supermassive black hole at a galaxy’s center actively accreting matter, producing bright emission across the spectrum.
- QSO
- Quasi-Stellar Object (Quasar) — an extremely luminous AGN, often at high redshift (z > 1). Key tracer for large-scale structure measurements.
- Near-IR
- Near-Infrared — wavelengths just beyond visible red light (~7000–10000Å in the Z-band). High-redshift features shift into this range.
- High-z
- High redshift — objects at great cosmological distances (z > 1.5), seen as they were billions of years ago.
- BAL
- Broad Absorption Line — a QSO showing wide absorption troughs from high-velocity outflows. Rare (~10% of QSOs) and often missed by pipelines.
- PSF
- Point Spread Function — the image of a point source (star or distant QSO). “PSF morphology” means it looks like a point, not an extended galaxy.
- REX
- Round Exponential — a Legacy Survey morphology classification for a small, round, slightly extended source.
- SER
- Sérsic profile — a Legacy Survey classification for galaxies fit with a Sérsic surface brightness profile.
- SIMBAD
- Set of Identifications, Measurements and Bibliography for Astronomical Data — the most comprehensive database of known astronomical objects (CDS, Strasbourg).
- NED
- NASA/IPAC Extragalactic Database — a database focused on extragalactic objects (galaxies, QSOs, clusters).
- fNL
- The amplitude of primordial non-Gaussianity — a key parameter for distinguishing between the Big Bounce and inflation.
Cross-Reference Status
How do we know these are previously unidentified? We cross-match anomaly positions against multiple astronomical databases. An object NOT found in any of these catalogs is a strong candidate for being genuinely new.
| Database | What it contains | Objects | Checked? | Matches |
|---|---|---|---|---|
| SIMBAD | Most comprehensive catalog of identified astronomical objects | ~17M | Top 10,000 | 21/10,000 (0.2%) — 99.8% absent |
| NED | Extragalactic objects (galaxies, QSOs, clusters) | ~400M | Top 10,000 | 1,270/10,000 (12.7%) — 87.3% absent |
| Gaia DR3 | 1.8 billion stars with astrometry & photometry | ~1.8B | Top 1,000 | 6/1,000 (0.6%) — only 1 confirmed Galactic star |
| SDSS DR18 | Sloan Digital Sky Survey — spectra + photometry | ~5M spectra | API down | SDSS API returning 500 errors — retry pending |
| AllWISE | 750M infrared sources — photometric detection catalog | ~750M | Top 1,000 | 15/1,000 (1.5%) — 98.5% have no IR counterpart |
| Milliquas v8 | Comprehensive QSO catalog — all known quasars | ~1M | Top 1,000 | 0/1,000 (0%) — ZERO are known QSOs |
| Liang+2023 EDR anomalies | Prior DESI EDR autoencoder anomaly catalog | ~250K | Pending | Catalog not published as downloadable file |
| Nicolaou+2026 EDR anomalies | Prior DESI EDR VAE anomaly catalog | ~208K | Pending | Catalog not published as downloadable file |
Current status: 5 of the 6 major databases have been cross-matched (SDSS is pending because its API is down); together they represent roughly 3 billion cataloged objects. Match rates: SIMBAD 0.2% and NED 12.7% (top 10,000); AllWISE 1.5%, Milliquas 0%, and Gaia 0.6%, with 1 confirmed Galactic star (top 1,000). The top-ranked anomalies are overwhelmingly absent from the catalogs checked so far, which makes them strong candidates for uncataloged objects.
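A positional cross-match of the kind summarized above can be sketched as follows. The 1 arcsecond match radius, the function names, and the toy coordinates are assumptions; the source does not state the radius used for any of the databases.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def match_fraction(anomalies, catalog, radius_arcsec=1.0):
    """Count anomalies with at least one catalog source within the radius.

    Both inputs are lists of (ra, dec) in degrees. Returns (count, fraction).
    """
    radius_deg = radius_arcsec / 3600.0
    matched = sum(
        1
        for ra, dec in anomalies
        if any(angular_separation_deg(ra, dec, cra, cdec) <= radius_deg
               for cra, cdec in catalog)
    )
    return matched, matched / len(anomalies)


# Toy positions: the first anomaly lies about 0.34 arcsec from a catalog source.
anomalies = [(10.0, 20.0), (150.0, -5.0)]
catalog = [(10.0001, 20.0), (200.0, 30.0)]
print(match_fraction(anomalies, catalog))  # (1, 0.5)
```

This brute-force loop compares every pair, which is only practical for small samples such as the top 1,000. For catalogs with millions of sources, an indexed matcher such as astropy's `SkyCoord.match_to_catalog_sky` is the usual choice.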
Prior Work & Attribution
Prior work: Autoencoder anomaly detection on DESI was pioneered by Liang et al. (2023) on ~250K EDR spectra and Nicolaou et al. (2026, MNRAS, 46 co-authors) on ~208K EDR spectra. This catalog extends their approach by ~90x in scale to the full DR1 release. Both teams must be cited in any publication using this catalog.