The Federal Public Service Policy and Support (BOSA) and AI4Belgium are organizing a hackathon on the theme of AI4Gov. Anyone interested in harnessing the power of AI for more efficient and/or user-friendly government processes is welcome to participate! The hackathon itself is currently scheduled for March 2021.
I’m happy to be part of the steering committee for this event. While my background allows me to do in-depth technical analyses of the projects and solutions, I’ll certainly pay special attention to fairness, accountability, transparency, ethics and privacy. By the way, I find the resources available on the Flemish Knowledge Center for Data and Society to be of great value in helping with these kinds of assessments.
An AI project is built on vast amounts of data. Good-quality data can be hard or expensive to gather, and there are serious privacy concerns when the data pertains to real persons. At the European level, the GDPR imposes high standards and restrictions on data gathering, management and usage.
This protects the consumer optimally, but it does not make the data scientist’s work any easier. As a result, the concept of “synthetic data” is gaining traction: fictitious data that simulates the statistical properties of the original dataset. Applications include, among others, dataset rebalancing, masking or anonymizing sensitive data, and building simulation environments for machine learning applications.
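As a toy illustration (my own sketch, not taken from the article), the simplest form of synthetic data generation fits a distribution to each column of the real data and then samples fresh, fictitious records from it:

```python
import random
import statistics

def synthesize(rows, n_samples, seed=42):
    """Generate synthetic rows that mimic each column's mean and
    standard deviation (columns treated as independent Gaussians --
    a deliberately naive model)."""
    rng = random.Random(seed)
    columns = list(zip(*rows))
    params = [(statistics.mean(c), statistics.stdev(c)) for c in columns]
    return [[rng.gauss(mu, sigma) for mu, sigma in params]
            for _ in range(n_samples)]

# Tiny "real" dataset: height (cm) and weight (kg) of four people.
real = [[170.0, 65.0], [180.0, 80.0], [165.0, 60.0], [175.0, 72.0]]
synthetic = synthesize(real, 1000)
```

Real synthetic-data tools also model the correlations between columns (with copulas or generative networks, for instance); this independent-Gaussian sketch only preserves per-column statistics.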
In Covid times, all seminars and presentations are naturally converted into webinars - this one included: an updated version of my Pitfalls of AI talk, which I’ve given several times now. I update it every time with the latest and greatest in AI failures; it never ceases to amuse!
This webinar was given at the invitation of Ordina for their JOIN Ordina JWorks event. The presentation, in English, was recorded and put on YouTube - enjoy!
The fast rise of online translation engines (Google/Bing Translate, DeepL, etc.) is changing the job content of professional translators. But are they also useful to the simultaneous interpreter? In this article for the Smals Research blog, I take a closer look at the specific challenges in that field and the state of the technology today.
This article was subsequently polished and republished in IT Daily. (All in Dutch; for an automated translation into English, click here.)
In my first 2020 article for the Smals Research blog (in Dutch), I describe five general questions we ask ourselves before diving head first into a new AI project. The article also links to a number of external resources where more AI management wisdom can be found. For an English translation you may try running it through Google Translate, without any guarantees as to its accuracy, of course ;)
With some regularity I speak to a general audience about my study topics at Smals Research. Lately I have mostly been speaking about the risks that come with AI projects, which deserve more attention amidst all the hype. AI is not a magic wand that makes everything work from the get-go: it is a complicated set of technologies, and there are many points of attention to consider in order to bring an AI project to a successful end. In these presentations I therefore highlight what can go wrong while developing an AI system (training data, confounding variables, objective function), at deployment (attacks against AI systems), in the impact on us as citizens (bias, fairness, transparency) and on society as a whole (policy issues, ethics, etc.).
This gave rise to a series of presentations, given at:
In my latest blogpost for Smals Research, I dive into the problem of discovering information in unmanageably large and unknown datasets, and the related problem of anonymizing the results at scale. These kinds of problems occur in legal research, investigative (data) journalism and auditing. Learn a thing or two about the concept of e-discovery here (article in Dutch).
AI is hyped, and governments too are considering it as a possible solution to whatever problems they face. To cut through the promotional talk and put public services back on solid ground, colleague Katy and I gave a series of well-attended, bilingual presentations for Belgian public service personnel. What is true and what is false about AI, and what can you do with it in an (administrative) government context? The slides are available for download.
Besides a short overview of the various flavours of AI, Machine Learning and Natural Language Processing, we also pay attention to the practical side of things: what about data collection and the law, what are the technical requirements, how do you organize the updating and maintenance of an AI system, and which ethical issues need to be taken into account? We illustrate this with a few small examples that were built within Smals.
This presentation was repeated several times for a varied audience. AI for government will certainly remain a core topic for me in the near future. If you’d like to exchange some ideas, or have some interesting proposals for the application of AI in public services, feel free to contact me!
In my latest blogpost for Smals Research, I discuss some of the risks that the latest progress in AI entails for our knowledge society: what is the impact on, for example, spam, scams, fake news and information warfare? A hot topic with the European elections coming up, and of course concluded with some recommendations. (Dutch only for the moment; an English translation will follow.)
In a consultancy assignment for the labour market analysis and prospection section of FOREM, the Walloon public employment service, I served on an expert panel for their report on the evolution of, and opportunities in, AI-related jobs:
Métiers d’avenir - Les métiers de l’intelligence artificielle (document in French).
Flemish TV channel Canvas organized a new edition of its competition for amateur musicians, Speel het hard, in late 2018. There was less time to prepare a piece this time, but I took the opportunity to tackle Rachmaninoff’s 1st concerto. Combined with my new job at Smals leaving fewer opportunities to practice, I wasn’t really ready in time for the finalist selection in early 2019. Still, it was a great opportunity to keep on playing. Here too, the project page, which includes my practice videos, is still online.
In this blogpost for Smals Research I present some of the many Facets of Natural Language Processing. This first article deals with parsing and automatic translation. (Articles are in Dutch, I will publish an English translation soon.)
Edit 07/02/2019: in the meantime, a second article has also been published. It deals with classification, entity recognition and the more general problem of (syntactic and semantic) ambiguity.
From 2010 to 2018 I was in charge of the Belgian Olympiad in Informatics in Flanders. As such I was also deputy leader for the Belgian team at the following International Olympiads in Informatics:
For my role as cofounder of the Belgian Olympiad in Informatics, the Royal Flemish Academy of Sciences of Belgium awarded me their annual prize for science communication in 2016.
In 2017, Flemish TV channel Canvas organized a competition for amateur musicians: Speel het hard. The mission was to practice a challenging piece for your instrument, getting it to concert-ready perfection within a timespan of roughly six months. A series of video blogs had to be made to document the process. I took the opportunity to take up piano practice a bit more seriously, and to give jazz a try with Kapustin’s 2nd sonata. While I didn’t make it to the final selection, it was great to get my technical abilities back up to the level I had ten years earlier. The project page, including my practice videos, is still online.
From 2014 to 2016 I set up the online library of the Brussels Royal Conservatory. This included the entire project from analysis to delivery, setting up and customizing a brand new OPAC, a serious effort to increase data quality to an acceptable level, data migration, statistics and monitoring, etc. The 500-page end report of the entire project is available here.
In Boston, October 22-24, 2016, I worked together with Helmut Herglotz on a search engine for music by instrumentation. Music instrumentation metadata is barely standardized; even today, searching by instrumentation rarely gives good results anywhere. I mostly worked on getting the data into a useful format while Helmut put together a frontend in Django; the resulting presentation can be seen on YouTube.
For the hackday in London in 2013, I used several APIs and Python libraries to chop a music fragment into pieces, extract the melody, and change its pitch according to a predefined scheme. The resulting MidiModulator is published on GitHub, where I’ve also put a few links to the first (rather funny!) results.
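The pitch-changing step can be sketched in a few lines of pure Python (an illustrative reconstruction, not the actual MidiModulator code): the MIDI note numbers of the extracted melody are shifted by cycling through a predefined list of semitone offsets.

```python
def modulate(melody, scheme):
    """Transpose a melody (a list of MIDI note numbers) by cycling
    through a predefined scheme of semitone offsets.

    Illustrative sketch only; the real MidiModulator works on full
    MIDI files rather than bare note lists."""
    return [note + scheme[i % len(scheme)] for i, note in enumerate(melody)]

# C4 D4 E4 F4, alternating between the original pitch and an octave up.
print(modulate([60, 62, 64, 65], [0, 12]))  # [60, 74, 64, 77]
```

With a scheme like `[0, 1, 2, 3]` the melody drifts chromatically upward, which is roughly how the odd-sounding early results came about.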
At Queen Mary University’s Centre for Digital Music, colleague Joachim Fritsch undertook a detailed comparison of Hennequin’s source separation method with my own, and proposed a hybrid version with better results. His excellent work, including a manually and carefully constructed evaluation database of multitrack audio recordings, can be found in C4DM’s digital repository.
Hackdays don’t always lead to a working result. In London on 3-4 December 2010, I tried my hand at polyphonic audio transcription by chaining different software libraries that each solved part of the problem. Unfortunately, each also had a significant error margin, cumulatively leading to no sensible output at the end. A list of projects from that weekend can still be found online, though the detail pages have disappeared. If I rediscover something in my archives, I may expand on this post later.
Detailed information about the ISMIR 2010 paper Evaluation of a score-informed source separation system, including the database used for evaluation, is available here.
The result of Music Hack Day San Francisco, May 15-16, 2010: a proof-of-concept plugin for MuseScore that returns a list of available recordings for an opened music score, as given by Last.fm. More details here.
Detailed information about the ICMC 2010 paper Source Separation by Score Synthesis, including audio examples and data files, is available on my CCRMA pages.
In 2008 I was pretty fond of MuseScore (pre-1.0), which existed mainly on Linux and Windows. I had just switched to Mac to develop other cross-platform software, so I wanted MuseScore on OS X as well. Most of the underlying libraries were cross-platform, and the OS X subsystem is pretty similar to Linux, so it should not be too difficult… I could show a prototype version in December 2008.
It turned out that the peculiarities of compiling C++ on the Mac, and some really dodgy issues with the font and typesetting system, were very hard to solve. With the help of the other MuseScore developers and one particularly useful bugfix from the Canorus project (they had the same font problem), I published the first alpha version of MuseScore on OS X (10.4 or 10.5) in April 2009.
I moved on to other projects and have not contributed to the MuseScore codebase since, but still regularly meet up with the core developers at events like FOSDEM.