HATLAS - a fedora data project

Hatlas News: 2026-05-08

News #

If you’re short on time, jump to the status brief below for the 30-second version. If you have any suggestions for how to improve this space, hit me up in Matrix or email.

This week was another huge week of laying foundation – this time, with yours truly diving into privacy policy and data protections in addition to infrastructure. What could go wrong?

Community Updates #

New member: @andungu #

Anne Ndung’u has joined us as our very first analyst to go through our new onboarding process! (More on that below.) She is a Data Science student from Nairobi and a current Outreachy intern.

She has an interest in AI and is currently writing a paper on AI erasure and bias, which looks at how data systems can sometimes overlook specific social contexts like gender and race. Her background is in building predictive models for social sciences such as housing market trends. We’re quite excited to put her skills to use here!

You can check out more on her blog at https://dev.to/andungu

Split from CommOps #

Now that our code repos are starting to get moving, we’ve decided that FDWG is big enough (and noisy enough) that it warrants splitting off from CommOps into our very own space. This will make it easier to find us and will set the stage for many other changes such as moving most of our repos from my personal Codeberg to Forge and setting up an official Fedora Docs site.

FAS groups have been created, and now we’re waiting on our Forge space.

Additionally, this news feed will (hopefully soon) start appearing on Fedora Planet.

Privacy Updates #

Policy is one of the biggest foundational pieces that need to be established and addressed across Fedora in order to have a healthy, secure, and sustainable data practice. In short, Fedora’s Privacy Statement and general GDPR stance create more questions than they answer, and the statement hasn’t been updated since 2018 when GDPR first became a thing. That’s not sufficient for my taste.

Of course, I’m just one person, with zero authority and zero law degrees. So all I can do is take care of my corner of the world and try to escalate. The lawyers still haven’t picked up the phone, so I’ll try a more-differenter /dev/null.

Volunteer Agreement #

Fedora’s Privacy Statement explicitly calls out that “Fedora is going to collect your activity data and analyze it” (paraphrased). This is what GDPR calls a “Legitimate Interest”. Users agree to this when they create their FAS account.

But who is Fedora, exactly? With a corporation’s employees or contractors it’s pretty clear who is acting on behalf of the corporation, but this is much less clear when no paperwork has transpired and anyone off the street can pick up a shovel and start digging.

If I take some Fedora data and do a bunch of evil things with it that violate GDPR, then did @mwinters do that, or did Fedora do that? What’s the determining factor? Is a FAS account all that’s needed for me to incur legal risk on behalf of Fedora / Red Hat / IBM?

We have been meaning for some time to lay out data handling guidelines for analysts, but I went one step further to address this murky legal water. I reviewed our privacy stance / data operations / etc extensively with Claude and iterated to create a Volunteer Analyst Agreement. This agreement explicitly calls out the “instructions” (required per GDPR) that Fedora is issuing to its analysts. As long as analysts sign this agreement (digitally!) and are following these instructions, they are acting on behalf of Fedora, so it is Fedora doing the analyzing.

Of course, this all has to go before Council for final approval, and they may want Legal to sign off on it. I hope this doesn’t sound too negative (we’re all very busy), but my realistic expectation for a response from Council is somewhere between 6 and 36 months.

Lest that sound unfair (it almost certainly is since I’m terribly uninformed in this area), here is my entire braindump of Council tickets and news that I can remember:

We’re implementing this agreement now since it doesn’t make anything worse, and it potentially covers my butt a little in the event of any major GDPR flare-ups. But I do wish it were possible to take serious things seriously here.

This is also the first step towards correcting many other upstream GDPR issues while avoiding the need to slam the door on all data operations (which I hope IBM legal will agree with me about! Though they’ve broken my heart a dozen times in the past, so I hold no hope if this needs to go through them.) Maybe we can get at least this much passed before the winds of fate conspire to blow me elsewhere 🤞.

Technical Updates #

Multiplayer Ops #

In short, we’ve made huge strides to open up the Hatlas infra. All of the Kubernetes configuration is now available for anyone to work on, and the Ansible config is WIP for the same thing.

This meant deciding upon and implementing a secrets mechanism for shared ops, for which I’ve settled on fnox + age. I honestly am super happy with this middle ground between something like KMS / Vault and “YOLO”.

This also meant writing a stupid amount of docs, which is normally my favorite thing, but I’m eager to get past the foundation-building phase!

SSO Everywhere #

Since we’re getting a steady flow of new contributors lately, I’ve decided to tackle the last remaining non-SSO service, which is Postgres. This means that it’s currently not accessible for new users, but I’m WIP’ing as hard as I can, folks.

Next (?) #

My standard disclaimer: my priorities shift constantly depending on opportunities and obstacles.

Flock #

Flock is coming! And I think FDWG will have a slot for a presentation + workshop!

FDWG still has sooo many things on the TODO list to get done before I’ll feel ready to announce our work to the world, but the flywheel is starting to accelerate here with the help of our new contributors. I’m hopeful I’ll be able to prepare something interesting and worthy of other people’s time and attention in the next 5 weeks…

More team foundations #

TODO lists? Friends don’t let friends use repo issue trackers. But is there one with the great organizational capabilities of Bugzilla? But with a lightweight UI? And OIDC? And open source? Open to suggestions!

FAS federation #

@smoliicek is almost fully onboarded and getting his head around our configs. He may be able to help us get federated with FAS so that people can log in directly to Hatlas with their FAS credentials.

Datanommer Pipeline Automation #

The data dictionary is coming along nicely with some huge contributions this past week from @evelynrp. (Thanks Evelyn!!) I’m really looking forward to finishing the POC of generating our Datanommer SQLMesh pipelines from the dictionary via Argo Workflows.
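
For anyone wondering what "generating a pipeline from the dictionary" could even mean, here is a minimal sketch of the general idea. Everything in it is hypothetical: the `Column` fields, the `render_model` helper, and the table and column names are made up for illustration and are not the actual Hatlas dictionary format. It also only emits plain SQL text, not a real SQLMesh model definition, and it leaves Argo Workflows out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One hypothetical data dictionary entry for a single output column."""

    name: str         # column name in the generated model
    source: str       # SQL expression that produces the column
    description: str  # human-readable meaning, copied into a SQL comment


def render_model(model_name: str, source_table: str, columns: list[Column]) -> str:
    """Render a list of dictionary entries as one plain SQL SELECT statement."""
    if not columns:
        raise ValueError("a model needs at least one column")

    lines = []
    for index, col in enumerate(columns):
        # Every selected column except the last one is followed by a comma.
        comma = "," if index < len(columns) - 1 else ""
        lines.append(f"  {col.source} AS {col.name}{comma}  -- {col.description}")

    body = "\n".join(lines)
    return f"-- model: {model_name}\nSELECT\n{body}\nFROM {source_table}"


if __name__ == "__main__":
    # Illustrative entries only; these are not real Datanommer column names.
    example_columns = [
        Column("message_id", "id", "unique id of the bus message"),
        Column("sent_at", "CAST(ts AS TIMESTAMP)", "when the message was sent"),
    ]
    print(render_model("example_messages", "raw_messages", example_columns))
```

The appeal of this shape is that the dictionary stays the single source of truth: descriptions and column definitions live in one place, and the SQL is a derived artifact that can be regenerated whenever the dictionary changes. A real version would also need to handle descriptions containing newlines, which this sketch does not.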

BI Tooling #

We are almost in position to launch general shared tooling such as Superset! And they just released a brand spanking new Operator. (When will I learn that the cutting edge is aptly-named?)

Docs Updates #

The public side of Hatlas is still way overdue for some updates. We also have a goal to create official FDWG docs to house this info.

Employment? #

I’m still unemployed. I’m trying not to get neurotic about that, though my situation is becoming desperate.

My family and I would be incredibly grateful if you’re able to contribute to any of the hosting costs for Hatlas (roughly $100/mo). ❤️

Status brief #

Done #

WIP #

On-deck #