<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><title>frankie-tales</title><id>https://lovergine.com/feeds/tags/ai.xml</id><subtitle>Tag: ai</subtitle><updated>2026-02-25T15:33:03Z</updated><link href="https://lovergine.com/feeds/tags/ai.xml" rel="self" /><link href="https://lovergine.com" /><entry><title>SM-Tools, Copernicus, FOSS and the reasons of inevitable choices and drifts</title><id>https://lovergine.com/sm-tools-copernicus-foss-and-the-reasons-of-inevitable-choices-and-drifts.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2026-02-25T15:00:00Z</updated><link href="https://lovergine.com/sm-tools-copernicus-foss-and-the-reasons-of-inevitable-choices-and-drifts.html" rel="alternate" /><content type="html">&lt;p&gt;Here at work, we develop a series of tools for geospatial processing with
multiple goals, expected maintenance durations, different scopes, generalization
needs, and motivations. Note that about 10 years ago, here in Europe, a
completely new approach to upstream and downstream services for Earth
Observation began: ESA changed its data licenses, and distribution and access
modalities entered the Big Data era.&lt;/p&gt;&lt;p&gt;That changed the approach to data even for academic institutions and triggered a
major shift in the daily work of researchers, who gained access to a tremendous volume
of weekly data, available almost just in time and worldwide. Of course, such a
change also impacted us, and we had to adapt our processing and storage
capabilities to the new era.&lt;/p&gt;&lt;p&gt;One of my side projects in that regard is
&lt;a href=&quot;https://baltig.cnr.it/francesco.lovergine/sm-tools&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;SM-Tools&lt;/a&gt;, which consists
primarily of a collection of support tools for running our internally developed
soil moisture algorithm using SAR satellite data (&lt;a href=&quot;https://sarwater.irea.cnr.it/smosar.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;SMOSAR&lt;/a&gt;).
One such tool (now named &lt;code&gt;smt_copernicus&lt;/code&gt;) began
(and evolved with multiple restarts-from-scratch) more than 10 years ago, when
the Sentinel constellation started operations. Its purpose is to search for and
download satellite products from Copernicus archives using multiple criteria,
and to maintain an internal geospatial database of these products, along with
all derived maps and ancillary data. This is only one component of a system that
should be able to process large quantities of multi-source data on selected
areas of interest, create downstream products, and calibrate and analyze results
by comparing them with field data. The clear final goal is to achieve new
findings in satellite data analysis, supported by extended processing worldwide,
and to introduce new algorithms.&lt;/p&gt;&lt;p&gt;This is a long-term goal that unfortunately runs up against short- to mid-term
difficulties of accessing archives that are not under our direct control. The
sad reality is that in the last 4 years the Copernicus archive access modality
changed 3 times, and in the previous period Copernicus also changed policies
and modalities mid-stream (e.g., by introducing online and offline products,
changing formats, etc.). Geospatial communities are small enough to encounter
more practical difficulties than expected under such operational conditions, and
this is now an almost weekly experience. We now have to chase other parties’
changes more often than in the past, instead of working on our own
goals.&lt;/p&gt;&lt;p&gt;For instance, until 2023, the main package used for accessing the Copernicus
archive was the &lt;a href=&quot;https://github.com/sentinelsat/sentinelsat&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Sentinelsat&lt;/a&gt;
Python package (developed by a handful of willing
scholars starting in the summer of 2015). It became abandonware that year, when
people discovered that the protocol changes required rewriting most of the
package from scratch, including all test code and mocks. That happened again
at the end of last year. Incredibly enough, all access protocols have been an
ongoing affair since 2023, even for non-secondary details, and required frequent
adjustments to avoid unexpected breakage in download and processing pipelines.
That certainly does not encourage community-supported FOSS solutions. In
2023, when I discovered that our Sentinel-related tool had to be deeply reworked
because of the loss of the obsolete Sentinelsat package, I decided that enough
was enough and migrated from Python to a fully self-supported Perl
reimplementation. One of the things I have always hated in the Python ecosystem is
the excessive (for me and our purposes, at least) speed at which consolidated
packages and features are deprecated, and the prospect of having to chase
unexpected changes on both the Copernicus AND the Python side was out of the question. My
experience with Perl has been much less annoying in this regard, with scripts
still running perfectly 20 years after they were written. Let’s consider
this the old-school approach: if something is working, don't touch it without a
more than valid reason, and even then, think twice before you touch.&lt;/p&gt;&lt;p&gt;In the meantime, around 2019, another independent effort started to support
Copernicus access, along with a few other data providers. That’s about 4 years
after the original Sentinelsat project, and timing here is essential to
consider. &lt;a href=&quot;https://github.com/CS-SI/eodag&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;EODag&lt;/a&gt;
is a single-company FOSS product that has been actively
developed since then, but could be considered stable only 1-2 years later. Of
course, it only provides the usual access layer, and adopting it meant,
at the time, replacing Sentinelsat with EODag as a base layer for searching and
downloading only, while performing other tasks afterward with self-made
code. Before 2023, Sentinelsat and EODag were equivalent for performing
the same tasks, with very little advantage either way.&lt;/p&gt;&lt;p&gt;Note that both tools were, in any case, fairly adequate but not sufficient for our
goals, and both had a few defects (or rather, a lack of flexibility) that had to
be managed in some creative way. That was one of the reasons for not replacing
Sentinelsat with EODag in 2023. The other major one was that replacing one
small package with another (as with Sentinelsat, there are just a
couple of main contributors to the codebase, along with a good number of
pending issues and PRs for such a product) is probably not the safest
way to avoid problems in the near future, when Copernicus changes things again
(see the upcoming Earth Observation Processing Framework (EOPF) data format,
Zarr). And of course, EODag is written in Python, and I have already expressed my
concerns about that.&lt;/p&gt;&lt;p&gt;Whether you like it or not, nowadays the concrete alternative to adopting small
FOSS projects to perform basic tasks is to use AI tooling to create a perfectly
(or almost so) tailored implementation for the target task. While in 2023 it
took me maybe a month of work (including some fixes to the
&lt;a href=&quot;https://metacpan.org/dist/Geo-GDAL-FFI&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Geo::GDAL::FFI package&lt;/a&gt;
for a multi-threading issue) to rewrite from scratch a
multi-threaded tool for accessing the Copernicus archive and consistently
maintaining an internal geospatial database of products across multiple
generations of the archive, I implemented the STAC protocol variant in a few
hours instead, thanks to Claude Code-based patching plus my own review and tests
of the resulting codebase. As said in &lt;a href=&quot;/is-ai-driven-coding-the-start-of-the-end-of-mainstream-foss&quot;&gt;this
post&lt;/a&gt;, currently,
the cons of adopting a small third-party FOSS solution outweigh the pros,
particularly regarding the resulting technical debt, compared with a
well-conducted self-consistent AI-based development process.&lt;/p&gt;&lt;p&gt;In my own side projects I’m seeing exactly a mirror of what will probably be
the reality of FOSS projects in the near future, as I mentioned in
the previous post. Relatively few major/interesting projects will be adopted by
others and attract contributions, while most codebases will become pure one-man
shows, with AI tooling.&lt;/p&gt;&lt;p&gt;A significant part of geospatial processing involves data procurement and
processing, i.e., refining and preparing data and images in order to collect,
filter, and process large volumes of data for subsequent analysis. This is the
most annoying and repetitive part of the process, and also often the most
time-consuming. In my experience, working on those tasks is probably the most
effective way to use LLMs through a spec-and-test-driven design. Whether you
like it or not, it is the most immediate way to produce working code by
iterating with a chain of thought and accurate review of results, including
decent test coverage. As observed in &lt;a href=&quot;https://antirez.com/news/159&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Antirez's
experience&lt;/a&gt;, the AI agent also had the nice
ability to retain my Perl style (which is not secondary, given that Perl has
many programming flavors and variants).&lt;/p&gt;&lt;p&gt;Maybe the final result will be an increase in quick-and-dirty codebases, but
for many scholars, it will be a major simplification of their lives. In most
domains of science, coding activities have been seen as a necessary evil: they
are a tool, not the primary goal, and even before the advent of LLMs, most
scientific codebases were far from something to be proud of. The Copernicus
attitude to FAFO will also encourage such approaches, simply because scholars
don't have time to waste chasing changes introduced by this or that data
provider or company when contracts change.&lt;/p&gt;&lt;p&gt;AI sloping attitude? No, simple survival instinct.&lt;/p&gt;</content></entry><entry><title>Is AI driven coding the start of the end of mainstream FOSS?</title><id>https://lovergine.com/is-ai-driven-coding-the-start-of-the-end-of-mainstream-foss.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2026-02-04T20:00:00Z</updated><link href="https://lovergine.com/is-ai-driven-coding-the-start-of-the-end-of-mainstream-foss.html" rel="alternate" /><content type="html">&lt;p&gt;Someone on Mastodon (I’m sorry, but I don’t remember who exactly) published a
short post that pointed to a rather technical economic study of the impact of AI
on FOSS software development [1].
It is no secret that the AI debate is highly polarized, and the enthusiasts for
the current trend in AI applications in the IT domain are at least as numerous
as those who are concerned/skeptical. What is certain is that no one can, in the
long term, prospectively evaluate the impact of AI on society, particularly in
the IT world.&lt;/p&gt;&lt;p&gt;The main thesis of the paper is that AI-based code production will end the
mainstreaming of FOSS software as we have known it over the last 15-20 years. The
paper begins with well-known episodes from recent history (specifically, the
Tailwind saga [2] and Stack Overflow's near-death experience [3]).&lt;/p&gt;&lt;p&gt;Of course, the paper presents a theoretical economic model to evaluate a
possible impact scenario for the FOSS production model, which could or could not
come to fruition, depending on the assumptions made.&lt;/p&gt;&lt;p&gt;My honest opinion is that conscious and accurate use of AI can accelerate
development, in both a good and a bad sense, depending directly on
the experience and skills of the people who use such models. That is why we are
seeing both slop and high-profile creations made with the aid of AI. Maybe slop contributions
are more prevalent simply because mediocre developers are the majority, and
mediocrity is the backbone of enterprise production (because it is the most
replicable and independent of contributors and their capacities).&lt;/p&gt;&lt;p&gt;Like it or not, modern software industries do not need, and actively fight against,
overly creative approaches. Enterprises need &lt;em&gt;aurea mediocritas&lt;/em&gt;, not isolated
geniuses. Also, depending on third-party creations apparently reduces the
enterprise's technical debt, because the debt is typically shifted onto someone else's
shoulders. Of course, this is an approach that works until it fails miserably when such
a third party disappears, changes its license model, changes its mind about the
product, changes its APIs, and so on.&lt;/p&gt;&lt;p&gt;That said, one clear consequence of using AI helpers in coding appears to be the
progressive disappearance of many packages, modules, and libraries, which can be
easily replaced by AI-generated creations tailored to the task. Just to cite one
practical example, Tailwind nowadays could be easily replaced by CSS and simple
JavaScript components, with the obvious advantage of not depending on yet
another third-party-controlled piece of code that could be subject to abrupt
changes from one version to another without notice and break existing codebases.
At the same time, Tailwind themes can be generated by AI without even consulting
its documentation (which apparently had an immediate impact on the company's revenues).&lt;/p&gt;&lt;p&gt;Another advantage is that AI-based, tailored solutions would reduce the amount
of code coming from external dependencies that solve other people's problems
instead of focusing on the minimal set of features for your own needs (with all the
implications of possible breakages arising from such an anti-minimalistic
pattern).&lt;/p&gt;&lt;p&gt;Of course, using AI helpers in this way does not reduce the effort required to
understand and create new software, but it probably raises the required
competence to a higher level, which could be better in the long term, while
encouraging quick-and-dirty approaches in the near term. The so-called &lt;em&gt;vibe
coding&lt;/em&gt; is not a black-and-white concept; it has a lot of grey tones directly
depending on the awareness, responsibility, and skills of the developer: as
said, it can accelerate in many senses (even into a wall),
increasing technical debt in an uncontrolled manner when in the hands of the
wrong individual. On that point, even Anthropic recognizes that AI abuse
can negatively impact coding skills and debugging capabilities [6].&lt;/p&gt;&lt;p&gt;Add to this the current very high infrastructure load many networks are
reporting, for which the AI bots currently seem to be the culprit [4]. This
seems like very strange behavior for such botnets, given that web crawlers have
been around since the 90s and should be able to handle infrastructure load
fairly well by now. It seems that AI companies simply aren't playing fair on
their own, or that the training phases of neural nets are far more
demanding. Maybe both?&lt;/p&gt;&lt;p&gt;So, what do I see as the future for FOSS development as a whole? I am not as
pessimistic as the cited article. For sure, I see fewer small contributions in
the long term. Today, there is a massive production of AI-slop-based
contributions to many prime-time projects, but I see this as incidental. In
recent years, the GitHub-based &lt;em&gt;path of honors&lt;/em&gt; has been a major self-promotion
channel for junior developers, which explains the drift toward low-quality
contributions: devs are (were?) strongly motivated to contribute and find in AI
slop an easy path to doing so, building personal portfolios. That’s also true for
fake security-related reports (see the well-known Curl project case [5] and others).
This is, of course, annoying, but in my view, that’s the result of current AI
hype and should normalize in the mid-term.&lt;/p&gt;&lt;p&gt;Also, in the near future, I see less and less relevance in FOSS projects that
are not sustained by a strong architectural idea, genuine innovation, a large
community, and a consistent development effort (much bigger than a few weeks or
months of work). That kind of project will become mainly background noise, let
me say. Maybe that could impact whole categories of FOSS software: it is not a
secret that many language hubs are full of packages/modules of dubious quality,
often used because they are available just a use/import directive away. In many
cases, such products will simply be replaced by an AI-based reimplementation.
Whether the final result will be better or worse in average quality, only the future
will show. For sure, AIAD (AI-Aided Development) will cause a progressive
&lt;em&gt;democratization/popularization&lt;/em&gt; of the development process, giving average
users access to possibilities once unavailable to them: we will probably see the
production of a plethora of small tools and workflows built on agents rather
than finished, refined products, like it or not.&lt;/p&gt;&lt;p&gt;The result could be an increase in FOSS products at the cost of lower average
generality and code quality, with a few high-end, tailored products for
mainstream applications instead. But was this really so different in the past? I
don’t think so. The true difference is probably the increase in quantity in both
sets of products, as amplified by AI tools: if one does not do their
homework, the result is clearly garbage, but that was true before AI, too.&lt;/p&gt;&lt;p&gt;&lt;em&gt;“AI gives us the worst and the best - simultaneously.”&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;(Daniel Stenberg, the Curl Maintainer)&lt;/em&gt;&lt;/p&gt;&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;&lt;ol&gt;&lt;li&gt;&lt;a href=&quot;https://arxiv.org/abs/2601.15494&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Vibe Coding Kills Open Source&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.eweek.com/news/tailwind-labs-lays-off-engineers-due-to-ai/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Tailwind Labs Lays Off Engineers, Citing the ‘Brutal Impact’ of AI&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.devclass.com/ai-ml/2026/01/05/dramatic-drop-in-stack-overflow-questions-as-devs-look-elsewhere-for-help/4079575&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Dramatic drop in Stack Overflow questions as devs look elsewhere for help&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.heise.de/en/news/OpenStreetMap-is-concerned-thousands-of-AI-bots-are-collecting-data-11157359.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;OpenStreetMap is concerned: thousands of AI bots are collecting data&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-slops/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Death by a thousand slops&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.anthropic.com/research/AI-assistance-coding-skills&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;How AI assistance impacts the formation of coding skills&lt;/a&gt;.&lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>AI training, copyright and the future of contents
creation</title><id>https://lovergine.com/ai-training-copyright-and-the-future-of-contents-creation.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2026-01-11T21:00:00Z</updated><link href="https://lovergine.com/ai-training-copyright-and-the-future-of-contents-creation.html" rel="alternate" /><content type="html">&lt;p&gt;I have already addressed the implications of modern LLMs, specifically their
training, in the context of copyright and licenses for both code and original
content. An 'IANAL' disclaimer applies to this post, but my honest opinion is
that such training is a legitimate form of reading and learning through study,
unless explicitly excluded from the licensee's rights in the license.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/images/ai-electric-sheeps.jpg&quot; alt=&quot;AI dreams of electric sheeps&quot; /&gt;&lt;/p&gt;&lt;p&gt;Following the exploitation of LLMs and the AI boom that began in 2022, several
lawsuits and litigations emerged among multiple parties, with a few reaching a
significant milestone through the first court rulings. Note that every country
has somewhat different regulations about copyright and fair use, so the current
lawsuits could be only the starting point of a long list of legal actions.&lt;/p&gt;&lt;p&gt;While most of the current lawsuits seem to demonstrate that Anthropic or Meta
had the right to use books bought (in paper or digital form) for LLM training
(on the basis of the fair use principle), the most problematic aspect is instead
the apparent use of pirated books taken from LibGen and other known piracy
websites, which, if confirmed, could result in potentially destructive damages
for the companies, forced to compensate authors and pay fees on the order of hundreds
of billions.&lt;/p&gt;&lt;p&gt;The same problems are present on the coding side: again, using FOSS-licensed
code for training could fall under fair use, but training on private or
proprietary codebases could be equally destructive for the
same companies, as well as for GitHub and Microsoft.
The key point would be demonstrating, without any doubt, the unfair use of
private or pirated content, of course.&lt;/p&gt;&lt;p&gt;I'm quite sure future licenses for FOSS codebases and documentation
could include an explicit exclusion clause for AI training, which could
jeopardize the legitimacy of use even for future FOSS code. I would expect
such a license change, as some projects already explicitly exclude AI-based
contributions. My opinion is that it could amount to
shooting oneself in the foot, given how pervasive AI tools currently are among
developers. Adopting AIAD can be a boost in development
speed if done with a healthy dose of skepticism (i.e., a human-in-the-loop
approach). On this, I'm quite convinced by Linus Torvalds's point of view: the
point is not who writes the code, but who is technically responsible for it and
ensures the required quality review and supervision.&lt;/p&gt;&lt;p&gt;Moreover, an implication of the current polarization in the AI hype is the
future (present?) crisis of traditional web content providers. A symptomatic
case is the StackOverflow crisis, which will, with high probability, lead to
the end of the service as we know it in the near future.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/images/stackoverflow-graph.webp&quot; alt=&quot;The crisis of StackOverflow&quot; /&gt;&lt;/p&gt;&lt;p&gt;That will have an
impact on future AI training, too, for sure, because SO has been for years a
huge source of knowledge about multiple fields in IT. What if fewer and fewer
people contribute to Wikipedia and general web content? What if more and
more sources of information were to reserve their content
for purely human-driven study? Knowledge has never been static in human history; AI
models will need to continuously enrich their training sets and stay up to date.&lt;/p&gt;&lt;p&gt;It would be grotesque if the whole AI hype were brought to a halt by such
copyright-based legal questions (even if I'm pretty sure a fully fair training
would be possible now for such companies, who knows the impact of a more limited
approach on the final result?). Surely, this seems the most serious threat to the
future of such companies and of AI-based solutions as a whole.&lt;/p&gt;&lt;p&gt;The only true solution to such a threat is finally having a truly open training
model, which details its sources and the whole training process with full
transparency, something that even the so-called open AI models are still far from being
ready to provide.&lt;/p&gt;&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;&lt;ol&gt;&lt;li&gt;&lt;a href=&quot;https://www.npr.org/2025/09/05/nx-s1-5529404/anthropic-settlement-authors-copyright-ai&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Anthropic settles with authors in first-of-its-kind AI copyright infringement lawsuit&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.anthropiccopyrightsettlement.com/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Anthropic Copyright Settlement Website&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.joneswalker.com/en/insights/blogs/ai-law-blog/why-anthropics-copyright-settlement-changes-the-rules-for-ai-training.html?id=102l0z0&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Why Anthropic’s Copyright Settlement Changes the Rules for AI Training&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;https://www.technologyreview.com/2025/07/01/1119486/ai-copyright-meta-anthropic/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;What comes next for AI copyright lawsuits?&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>Still, no silver bullet</title><id>https://lovergine.com/still-no-silver-bullet.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2025-08-24T18:30:00Z</updated><link href="https://lovergine.com/still-no-silver-bullet.html" rel="alternate" /><content type="html">&lt;p&gt;I recently re-read the seminal book by Fred Brooks about software engineering,
entitled &amp;quot;The Mythical Man-Month&amp;quot; or MM-M for brevity. Specifically, I read the
paper version of the 20th anniversary, which was revised and reprinted in 1995,
after the first edition of 1975. I did that on purpose, firstly because it is
always a fantastic read, and secondly to understand how much of its contents are
still valid today, exactly thirty years after its last revision.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/images/mm-m.png&quot; alt=&quot;The Mythical Man-Month&quot; /&gt;&lt;/p&gt;&lt;p&gt;Fred passed away in 2022; otherwise, it would be interesting to know his thinking
today, after the LLM boom and the birth of AIAD (AI-Aided Development) as
a new revolutionary (or often seen as such) tool. Hi, Fred, wherever you are.
It is worth mentioning that AI was already taken into consideration by Brooks at
the time, even if limited to expert systems and other rule-based variants, which
seemed promising and were often sold as revolutionary before the mid-90s. Much of
the book's content has entered the history of software engineering, including
the famous &lt;a href=&quot;https://en.wikipedia.org/wiki/Brooks%27s_law&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Brooks' Law&lt;/a&gt;, and the
whole book is still an excellent source of inspiration for the management and
organization of any complex intellectual project (not necessarily limited to
software systems) that involves large teams of individuals.&lt;/p&gt;&lt;p&gt;One of the main theses of the latest book edition is that in the short term of
10 years from its proposition (the original essay was dated 10 years after the
first edition of the book), he did not expect a &lt;a href=&quot;https://en.wikipedia.org/wiki/No_Silver_Bullet&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;&lt;em&gt;silver bullet&lt;/em&gt;&lt;/a&gt;.
That means no significant technological or managerial development was expected to be able to
improve our productivity in programming by one order of magnitude. Ten years
later, he confirmed the same idea, even considering exceptional tools like
old-generation AI, visual programming, CASE tools, and so on.
Is this thesis contradicted in 2025 by the existence of current AIAD tools,
including chatbots, agents, and AI-empowered IDEs? My honest answer is no:
not now, and not in the foreseeable future. The reason is exactly the same one
Fred posed at the time. Reducing &lt;em&gt;accidental&lt;/em&gt; problems in software creation
(what AI is able to do) cannot be confused with the &lt;em&gt;essential&lt;/em&gt; problems in
software creation: the complexity of defining an articulated task, its
analytical specifications, and an algorithmic solution to solve it. First of
all, ignore the simplistic case of asking an AI engine to implement a very
&lt;em&gt;simple&lt;/em&gt; program. Here, the word simple means truly that. If you can specify a
request within a manageable token context, however large, and formulate it
as a brief question (say, a few dozen lines),
well, that's probably an example of a simple (or dumb) problem. Too small, too
easy. We are talking about a whole system that is generally difficult to
describe, even in thousands of pages of specifications and documentation,
written collectively by large teams of developers, architects, and domain
experts.&lt;/p&gt;&lt;p&gt;The hard truth is that most of the real-world information systems out there
cannot simply be specified in such a way. We are not able to define an
unambiguous and complete enough specification to describe such systems, not to
mention being truly able to write a complete and neat documentation of it,
including its inner workings and use. We live in a deep illusion about that. The
context size needed to reach the level of detail that avoids bugs and ambiguities
in a specification would be impractical for current and even future tools, as well
as for any human. We would in any case get buggy (i.e., incomplete or
misunderstood) results even if the AI engine were able to avoid hallucinations
(which is not the case) and had no limitations for context size. The presence of
AI hallucinations is only accidental in this regard.&lt;/p&gt;&lt;p&gt;With current AI tooling, we are simply moving the complexity from writing
a formal language step by step to using natural language at a higher
level of abstraction to express the problem. The complexity is still there, and
it is inherent to the problem. Again, we resolved an accidental difficulty in a
creative manner, no different from moving from assembly to a modern
programming language. Now the difficulty has moved elsewhere, but it is still
there, and natural language is even more complicated to use than a formal
language. These difficulties translate into multiple refinements and trials to
try to be more precise and get sensible answers and code in a continuous
iteration. Isn't that remarkably similar to the ordinary process of developing a
program? In the most simplistic approach, such a process becomes &lt;em&gt;vibe coding&lt;/em&gt;,
and the iterations can tend to infinity, a forever loop. The smarter
programmer, for an easy task, will instead converge in a reasonable (hopefully limited)
number of iterations. Is that a significant improvement of one order
of magnitude? I think not, as in most past cases. As in the case of
high-level languages instead of assembly, they improved efficiency in coding as
asserted in MM-M, but not by a whole order of magnitude. The AIAD is again
another helper to solve accidental difficulties. The problem and all its
complexity are still there. Thinking that we found the silver bullet is again
(and again) an illusion or pure marketing.&lt;/p&gt;&lt;p&gt;So why do many CEOs insist on predicting a bright yet unlikely future in which AI
agents, instead of developers, create applications? Brooks already wrote
about that: there is a profound confusion in exchanging months and men, and an
excess of optimism when approaching software development, even among techies,
which becomes paroxysmal among managers. No one can seriously provide even a
decent and reasonable estimate of development time starting from incomplete,
ambiguous, or vague specifications: the same error happens systematically in
overestimating the capabilities of current AI tools.&lt;/p&gt;&lt;p&gt;So what? AIAD is simply yet another tool among those available to developers,
but the management problem of mastering complex projects is still there, with
all its inherent difficulties. And the possibility of using natural language
instead of a high-level formal one is only an apparent simplification of the
process. It looks more familiar and easier, but it is also much more
ambiguous, and the so-called &lt;em&gt;prompt engineering&lt;/em&gt; is again a purely optimistic
illusion, a heuristic approach to try to overcome our totally insufficient
capabilities of dominating nuances and semantics.&lt;/p&gt;</content></entry><entry><title>Again about AI, copyright, uses and abuses</title><id>https://lovergine.com/again-about-ai-copyright-uses-and-abuses.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2025-04-29T20:00:00Z</updated><link href="https://lovergine.com/again-about-ai-copyright-uses-and-abuses.html" rel="alternate" /><content type="html">&lt;p&gt;My &lt;a href=&quot;https://lovergine.com/ai-artifacts-copyright-and-electric-sheep-dreaming.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;last post&lt;/a&gt;
dealt with some ideas about copyright that need more in-depth
analysis. First, as was common in the good old days, &lt;em&gt;IANAL&lt;/em&gt; applies to this
post and the whole topic.
The final results of current litigations in courts that touch on some of the
primary companies involved in the whole AI thing could ultimately differ from
what is now the common sense point of view (mine). This post could become
rapidly obsolete, so another disclaimer is due for this aspect, too.&lt;/p&gt;&lt;p&gt;A nice summary of the matter as of the summer of 2024 can be read in [1].&lt;/p&gt;&lt;p&gt;I have to partially amend the assertion I made about copyright held by
Anthropic/Google and any other such company, in a certain sense. At least under
US copyright law, anything directly produced by non-humans is (or currently
seems) not eligible for copyright protection [2]. As you know, there are a few
differences between
different jurisdictions, specifically the US and the rest of the world. That's
the reason why, indeed, Joseph Borg and Galyna Podoprikhina by WH Partners [3]
say:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Despite the &amp;quot;common&amp;quot; belief that a work can only be
protected by copyright if it is created by a human, one must
bear in mind that copyright laws are not uniform around the
world, especially when it comes to AI-generated work, or a
work created with the assistance of AI. Currently, apart
from the UK as described above, AI artwork is also subject
to copyright in Ireland, India, and New Zealand.&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If a photo taken with a camera is admissible for copyright, the same could be
claimed for any other tool, one could say. In past centuries, even
photography would undoubtedly not have been admitted at first to copyright or
to the artistic scene: indeed, a painting is a lot different from a photo. A modern
photographer would not agree with such mortification of her/his work. Today, the
reality is quite different, and photography is fully accepted among modern arts,
with all implications about intellectual property and copyright.&lt;/p&gt;&lt;p&gt;Most authors agree with the significance of the human contribution to the work
to define a copyrightable work, but how this contribution could be quantified is
obscure. Are the number of prompts and replies, and the size of context
contributions, significant and sufficient efforts, or should the contribution
be quantified in LOCs and the size of direct patches to the AI artifacts? And in
that case, what is the percentage of human-driven contribution that represents
the threshold for deciding if the work is copyrightable or not? I'm afraid that
the final verdict is something to decide in a court, as in cases of plagiarism.&lt;/p&gt;&lt;p&gt;Long before AI, we also had RAD and no-code/low-code utilities that seemed
the future of development for specific applications, with all their limitations
(not too different from AI ones, to be fair). Even in those cases, copyright
claims could be problematic.&lt;/p&gt;&lt;p&gt;That said, there is also the problem of training possibly performed without
authorization. While most FOSS software is covered by one of the OSI
licenses, not all licenses are compatible with each other, so the licensing
status of the resulting LLM is questionable and possibly unfair. I will ignore,
for decency, the possible use of proprietary content for training, which already
seems to be the subject of lawsuits by multiple parties in some contexts: see
for instance [4] and [5].&lt;/p&gt;&lt;p&gt;As written in [2], the final destination of the whole topic is still foggy and
unclear. Multiple parties are involved, and a series of lawsuits and claims are
pending. This seems to be the reason why some companies explicitly forbid the
use of AI tools in their developers' daily work. Due to their pervasive
diffusion at multiple levels, this is becoming increasingly difficult to
avoid. As often in the past, tools are still ahead of rules, and sh*t could
happen in a not so far future.&lt;/p&gt;&lt;h2 id=&quot;references&quot;&gt;References&lt;/h2&gt;&lt;ol&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://terms.law/2024/08/24/who-owns-claudes-outputs-and-how-can-they-be-used/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Who Owns Claude’s Outputs and How Can They Be Used?&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://builtin.com/artificial-intelligence/ai-copyright&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;AI-Generated Content and Copyright Law: What We Know&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://whpartners.eu/news/ai-generated-art-copyright-implications/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;AI-Generated Art: Copyright Implications&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.theguardian.com/technology/article/2024/aug/20/anthropic-ai-lawsuit-author&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Authors sue Anthropic for copyright infringement over AI training&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://sustainabletechpartner.com/topics/ai/generative-ai-lawsuit-timeline/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Generative AI Lawsuits Timeline: Legal Cases vs. OpenAI, Microsoft, Anthropic, Nvidia, Perplexity, Intel and More&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;</content></entry><entry><title>AI artifacts, copyright and electric sheep dreaming</title><id>https://lovergine.com/ai-artifacts-copyright-and-electric-sheep-dreaming.html</id><author><name>Francesco P. 
Lovergine</name><email>mbox@lovergine.com</email></author><updated>2025-04-22T12:30:00Z</updated><link href="https://lovergine.com/ai-artifacts-copyright-and-electric-sheep-dreaming.html" rel="alternate" /><content type="html">&lt;p&gt;&lt;a href=&quot;https://lovergine.com/coding-with-ai-the-good-and-the-bad.html&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;My last post&lt;/a&gt;
captured the attention of my old fellow
&lt;a href=&quot;https://strk.kbt.io/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Sandro 'strk' Santilli&lt;/a&gt; on Mastodon,
who sent a provocation about the whole AIAD thing.
So, the challenge is accepted.&lt;/p&gt;&lt;p&gt;&lt;img src=&quot;/images/strk.png&quot; alt=&quot;strk reply&quot; /&gt;&lt;/p&gt;&lt;p&gt;First and foremost, the whole AIAD issue is a complex and hotly debated topic.
The question of whether the training practices of the past and present should be
considered fair use is a matter of contention. This is particularly true for
existing code bases on well-known repositories and other types of content, a
complexity that keeps us all intellectually engaged.&lt;/p&gt;&lt;p&gt;In the specific case of code, the basis for deciding about such a question was
formerly stated in the accompanying licenses. Honestly, there is nothing in the
current formulation of OSI licenses (BSD and GPL, among others) that precludes
such a very special activity. The non-discriminating conditions apply to any
human activity, and the neural net training - undoubtedly a human-driven task -
is not excluded. Training is a kind of learning activity, or something of the
sort.&lt;/p&gt;&lt;p&gt;That said, the NN training is, of course, a massive and intensive kind of
learning. But let me consider any artifact created by an AI model as a direct
derivative product (this is a stretch of the concept for me, but let me
axiomatically accept it). That could be an 'original' product created from
scratch (i.e., by direct prompting or by a limited documentary context) or a
directly derivative one (because it is based on previous code proposed as part
of the AI context). Suppose you ask any of the big models for a direct
clarification of their terms of use. In that case, you will discover that they
retain copyright ownership of their artifacts, but grant a very permissive
use that perfectly adheres to the four fundamental freedoms of FOSS licenses. Of
course, the results could be based on a previous code base. In that case, its
license still applies to such a derivative work, even with the additional
copyright of Anthropic, Google, or OpenAI.&lt;/p&gt;&lt;p&gt;This could pose a significant problem if the final software product needs to
retain one specific copyright holder, as is the case with most proprietary
software or some FOSS ones. Understanding the potential impact of copyright
transfer is a key consideration in this context, and it's crucial for us to be
fully informed and aware.&lt;/p&gt;&lt;p&gt;Is the process of an AI participating in development really that different from
what any average hacker does when participating in FOSS projects? I don't think
so. You add your copyright to the existing ones for the parts that are under
your direct control and accept conditions of use already defined in a license.
The true challenge, in my humble opinion, lies in changing licenses. Any
ex-post change should start with a note of acceptance from all copyright holders,
including Anthropic, Google, etc.&lt;/p&gt;&lt;p&gt;I consider this difficulty a feature, not a bug. I advise against participating
in projects that require copyright transfer because of the potential for changing
the license later when your contributions move out of your control. It happened
in the past, and it will happen again.&lt;/p&gt;&lt;p&gt;One should also consider the licenses of many other sources of knowledge and
inspiration for developers. How many people know that the default license for
StackOverflow code snippets is CC BY-SA? Indeed, how many developers actually
add an acknowledgment in their software for such snippets? Or even for snippets
taken from sites, blogs, books, or manuals without considering that such sources
are even more restrictive for use and creating derivative work?  Isn't our full
daily work the result of a long learning phase, based on our education and
training by books, experiences of others, as well as trial-and-error processes?&lt;/p&gt;&lt;p&gt;That said, let me spend some words about the elephant in the room. In the
context of AI and copyright law, the 'elephant' represents the complex nature of
AI models and their potential to create original works. Do AI models dream of
electric sheep? Well, I don't think current models are pure stochastic parrots,
to be honest. I think there are probably dozens or hundreds of cognitive forms
that govern what we generically call intelligence, including some emotive and
empathic forms that one can also find in a dog, a cat, or a dolphin. One of
those forms is probably captured by the neural network model of functional
representation, which perhaps we also share with part of our own mind. On
average, we are much more effective and efficient in those regards, and to be
fair, I would also say that hallucinating is a common experience even on the
human side. We are much more complete and retain contexts as wide as a
lifetime. Some people are terrified by this observation and seek refuge in
negation or the certainties of faith.&lt;/p&gt;&lt;p&gt;We are complex organisms with probably still partially known processes that
govern our so-called intelligence, which is physically based on cells and energy
in our brains, whether we like it or not. We found a way to mimic part of this
complex process, with all its limitations. Is this intelligence? I don't
know, but at the end of the day, what is intelligence? When no one asks me about
that, I know perfectly well what it is, but if you ask me, I don't know
anymore.&lt;/p&gt;</content></entry><entry><title>Coding with AI, the good and the bad</title><id>https://lovergine.com/coding-with-ai-the-good-and-the-bad.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2025-04-20T17:40:00Z</updated><link href="https://lovergine.com/coding-with-ai-the-good-and-the-bad.html" rel="alternate" /><content type="html">&lt;p&gt;Like many other developers, I recently started using some LLM-based AI systems
as helpers for coding in a few languages. I'm not a fan of VSCode, and I prefer
a more traditional approach to coding: I hate coping with code-completion
servers and use one of my preferred editors, Vim or Emacs. Navigating by tags is
more than enough for me. That said, this is the summary of my current experience
in the new world of AI-aided approach to coding (i.e., AI-aided development or
AIAD for brevity).&lt;/p&gt;&lt;p&gt;There's a clear divide among developers when it comes to AI tools: some love
them, while others are more skeptical. If you're keen to delve deeper into this
topic, &lt;a href=&quot;https://www.antirez.com/latest/0&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Salvatore Sanfilippo&lt;/a&gt;, also known as
&lt;code&gt;Antirez&lt;/code&gt;, shares some insightful perspectives on &lt;a href=&quot;https://www.youtube.com/@antirez&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;his YouTube
channel&lt;/a&gt;.  He provides comprehensive
evaluations of the leading AI models and systems, offering a balanced view of
their strengths and limitations.&lt;/p&gt;&lt;p&gt;Based on my experience, while both Anthropic and Google top models are equally
good enough for use, a big difference lies in their relative UX, which varies
enormously among systems. Luckily, things are continuously changing in
those areas. Copying and pasting without a clean and accessible per-chat/project
assets catalog is definitely a pain. The most effective approach is sharing
pieces of a git-based code base, which is not possible with all AI
chatbots. Even Salvatore shared a few impressive results from his experiences,
and I can definitely assert that, at least currently, some of those systems
can be quite helpful for some tasks if (and only if) used with a grain of salt.&lt;/p&gt;&lt;p&gt;First and foremost, the most effective use of AI tools starts from very
circumscribed tasks, performed in a step-by-step approach after carefully
establishing the context on which choices are based. In other words, any model
must be led by hand in the right direction. It is crucial to use what Claude AI
calls &lt;em&gt;projects&lt;/em&gt; to create a clear context, with a general but detailed
architecture description and key assets for orienting the language model.
It is also essential to maintain order meticulously. Being verbose enough and
precise is the key point, and that's typically what a senior profile should be
able to do. Each resulting asset needs accurate reviewing and testing; it is
simply impossible to assume that a simple change in human terms corresponds to
perfectly valid changes in the AI assets. What is easy for us may not be for the
AI model, and vice versa.&lt;/p&gt;&lt;p&gt;The review should be both stylistic and functional because complicated
programming patterns could be completely missed by the AI model. That is
especially true when documentation about APIs is imprecise or misses the
point. The models also tend to generate redundant code or hallucinated code
snippets that cannot be compiled or interpreted. Sometimes, the AI model
generates incredibly good code at large but with silly or subtle oversights.
At the end of the day, there is not always a gain in development time; it is
only something different: you spend more time reviewing and fixing bugs than
writing code from scratch. It is essential to cope with the limits of current AI
systems to develop a handy way to compose multiple parts together and keep track
of improvements and fixes in multiple iterations of the process. During
iterations, one can hit the limits of the AI context and get incomplete or
truncated files, and it is fundamental to have a way to regain the path with
minimal effort. In other words, one has to form a well-defined idea of the
resulting products and all intermediate artifacts.  That's the reason why
seniority and experience are fundamental to using such tools effectively: sorry,
Mr. CEO, engineers are here to stay and to be paid for good work.&lt;/p&gt;&lt;p&gt;Tasks that can be easily covered by current AI models are:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Creation of general documentation and/or comments in existing code.&lt;/li&gt;&lt;li&gt;Creation of simple tests from a structured codebase.&lt;/li&gt;&lt;li&gt;Translation of code from one programming language to another,
including document formats such as XML, JSON, and YAML.&lt;/li&gt;&lt;li&gt;Creation or improvement of auxiliary tools and boilerplate.&lt;/li&gt;&lt;li&gt;Reviewing and improving existing code step by step.&lt;/li&gt;&lt;li&gt;Help in identifying common issues and possibly interesting features.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The average results are good enough for an initial base
of successive analysis and development, and I found the AIAD especially helpful
with activities that I judge seriously dull and time-consuming. It's a relief to
have AI tools that can handle all the web front-end-related activities, which
I personally find less interesting.&lt;/p&gt;&lt;p&gt;Using multiple AI models to cross-review a code base and possibly prepare
multiple improvement alternatives while purging the obvious misunderstandings is
also an interesting opportunity. At the end of the day, I judge AIAD as
effective enough, except for its most significant limitation: a
not-too-extended context size (at least with basic pro/premium profiles). This
intrinsic limit is the source of major problems in the UX, along with still
rough or incomplete interfaces: if you need to review and modify a large code
base, you can easily run up against such limitations and have to apply a
divide-and-conquer strategy intensively to govern hundreds of thousands of
LOCs (but is diving into any unknown big code base so different, anyway?).&lt;/p&gt;&lt;p&gt;Of course, the correct approach is understanding that its use changes
programmers from pure creators to reviewers. In those regards, I generally
consider AIAD a Stack Overflow on steroids: anyone who used SO in the past
found valid and interesting answers to questions along with perfectly misleading
suggestions, and this is no different, and mostly better.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Disclaimer: I mostly used Claude and Gemini Pro, much less ChatGPT and Deepseek
due to their intrinsic limits for UX.&lt;/em&gt;&lt;/p&gt;</content></entry><entry><title>Knowledge dispersion, AI and fragmentation of the information sources</title><id>https://lovergine.com/knowledge-dispersion-ai-and-fragmentation-of-the-information-sources.html</id><author><name>Francesco P. Lovergine</name><email>mbox@lovergine.com</email></author><updated>2025-01-06T21:00:00Z</updated><link href="https://lovergine.com/knowledge-dispersion-ai-and-fragmentation-of-the-information-sources.html" rel="alternate" /><content type="html">&lt;p&gt;One of the coding projects I currently maintain is
&lt;a href=&quot;https://github.com/fpl/autodir&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Autodir&lt;/a&gt;, a not-so-known little daemon based on
&lt;a href=&quot;https://www.kernel.org/doc/Documentation/filesystems/autofs4.txt&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;autofs&lt;/a&gt; that
can be used to automagically create user or group directories at their first
use. It is especially helpful when some kind of shared accounts
system is adopted for multiple hosts and the related home directories need to
be created optionally and on demand. Well, I recently moved the old repository
from SourceForge to GitHub,
and that has also been the occasion for me to update the old Docbook howto
document for Autodir initially written by the original Autodir developer,
Venkata Ramana Enaganti. I have mostly maintained the project as a Debian
package over the last 20 years or so, with little interest in feature
improvements: it
basically just works, and that's more than enough for my use cases.&lt;/p&gt;&lt;p&gt;That has been for me the occasion to re-discover the &lt;a href=&quot;https://tldp.org/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Linux Documentation
Project&lt;/a&gt; and the &lt;a href=&quot;https://docbook.org/&quot; target=&quot;_blank&quot; rel=&quot;noopener noreferrer&quot;&gt;Docbook format&lt;/a&gt; used
for documentation. Well, for old-school users like me, LDP has been, in the
past, a Holy Bible sort of reference. So, what a disappointment for me to find
out that the project is frozen in all regards and has not seen any update for
years. Many of the howtos have become obsolete, untouched in the last
10-15 years or more. Any updates to the LDP git repository - if any - are not
even reflected on the LDP website. That's a pity because the Autodir howto is
still one of the main items returned by any web search, but it is sadly
obsolete, with little possibility of changing its status. In brief, LDP is no
longer either authoritative or reliable as an information source.&lt;/p&gt;&lt;p&gt;This personal episode is the original reason for this post, which is also about
the status of documentation and information provided for many FOSS communities
and the possible future of content creation and solid knowledge building for the IT
domain as a whole.&lt;/p&gt;&lt;p&gt;Of course, there are still some happy isles where the provided documentation and
manuals are a pleasure to consult and represent the primary source of trustable
knowledge about a topic. I can think of the BSD or Guix handbooks and some
community-driven wikis, such as the Arch one, as well as some Debian guides.
Selected tools, libraries and languages also have very lovely and complete
documentation that is up-to-date and enjoyable to read.&lt;/p&gt;&lt;p&gt;This is not the case for the average tech topic because, in recent years,
building a decent and deep knowledge of any IT argument, library or tool has
become a nightmare. Information sources have become dispersed among tons of
small websites, blogs, magazines, courses, books, small manuals, papers,
podcasts, news articles, or even videos. Often, the very few truly informative
references are hidden and confused among a &lt;em&gt;plethora&lt;/em&gt; of low-quality
stuff. We suffer from an &lt;em&gt;infodemic&lt;/em&gt; that becomes worse every day, not
only for general-interest information but also for specialist content.
Even paying for information is no longer a guarantee of quality, with books,
courses, and e-learning resources often hastily prepared for shallow consumption
and becoming obsolete at the speed of light (when they are not completely
wrong from the beginning). I'm a long-standing subscriber to the O'Reilly
learning platform, and I know what I'm talking about.&lt;/p&gt;&lt;p&gt;Sadly, our prospects are even worse thanks to the new generative AI systems,
with a probable intensification of automagically created content without even
minimal quality checking. This is something that is already happening at every
level, but it is especially grave in the case of the IT domain, where the
combination of AI coding tools and the progressive reduction of seniority and
expertise in some fields could, in a few years, upset the whole process of
building knowledge and solid experience for creating a problem-solving attitude
from scratch. Even in the scientific domain, this progressive enshittification
of the quality of publications is sadly tangible, so some editors already
require an explicit declaration that AI tools were not abused in preparing and
reviewing publications.&lt;/p&gt;&lt;p&gt;Today, junior profiles are probably condemned to marginalization in the IT world
thanks to massive generative AI use in coding, but someone should ask themselves
how senior profiles could be trained in the future under this regime. For sure,
even in the FOSS ecosystem it is already very difficult to suggest introductory
references and authoritative repositories of well-reasoned documentation to newbies.&lt;/p&gt;&lt;p&gt;Optimists will say that the whole process has simply changed in the last few
years and will change again with the current AI trends: nothing is lost, and I'm
merely attached to the old-school approaches, which need to change and will do
so again in the future.&lt;/p&gt;&lt;p&gt;Maybe the answer will simply be one-to-one mentorship-based training, as in the
distant past, to patiently find the right path in the current information jungle. I
have no certainty about that.&lt;/p&gt;&lt;p&gt;&lt;em&gt;I hope so, my Padawan. Really, I hope so.&lt;/em&gt;&lt;/p&gt;</content></entry></feed>