List of seminars offered during the second semester, academic year 2022/2023:

Seminar 1: Disease understanding: Dealing with complex and unstructured big data in biomedical domain

  • Professor / Speaker: Alejandro Rodríguez González
  • Affiliation: UPM, Biomedical Technology Center CTB
  • Contact: Prof. Alejandro Rodríguez, alejandro.rg@upm.es
  • Summary: Big data applications in the healthcare sector show high potential for improving the overall efficiency and quality of care delivery. Unstructured data represents a powerful untapped resource, one that has the potential to provide deeper insights and ultimately help drive competitive advantage. Unstructured data now makes up a very significant portion of all data, and all kinds of companies are rapidly exploring technologies for analyzing it. Solutions for analyzing these kinds of data can be applied in other domains that use data sources of a similar nature. In the healthcare sector, big data analytics still has to address several technical requirements, such as: i) the use of Electronic Health Records (EHR) and its implications; ii) preprocessing of natural language text; iii) annotation of images; iv) dealing with data silos and building solutions that avoid them, etc. This seminar focuses on the concept of disease understanding, a very relevant field that allows a better comprehension of diseases, how they are related, and how these relationships can be used to improve the biomedical domain and sector. Disease understanding can be improved through the acquisition and analysis of data from both structured and unstructured sources. The seminar covers the retrieval of such information for the aforementioned disease understanding goal.


Seminar 2: Drugs4COVID: Combining natural language processing, text mining and knowledge graphs in Health: challenges and a use case

  • Professor / Speaker: Prof. Óscar Corcho (UPM) and Carlos Badenes-Olmedo (UPM)
  • Affiliation: UPM, Ontology Engineering Group, “AI.nnovation Space” UPM Research Center for Artificial Intelligence
  • Contact: Prof. Óscar Corcho, ocorcho@fi.upm.es
  • Summary: In this seminar we will describe how we have used a range of state-of-the-art methods, techniques and tools in the areas of Natural Language Processing, text mining and knowledge graphs to build an online system that allows browsing a large corpus of scientific literature that was created and has been maintained since March 2020, with the emergence of the COVID-19 pandemic. After providing a general overview of why and how we built the system, we will go into more depth in areas such as probabilistic topic models and knowledge-graph-based question answering.


Seminar 3: Big Data Visualization

  • Professor / Speaker: Prof. Mariano Rico
  • Affiliation: UPM, Ontology Engineering Group OEG, “AI.nnovation Space” UPM Research Center for Artificial Intelligence
  • Contact: Prof. Mariano Rico, mariano.rico@fi.upm.es
  • Summary: The semantic web (also known as the Linked Data Cloud) is a huge graph that requires powerful tools to visualize its content appropriately. In this seminar we will take a practical approach using one such tool: Gephi. Although this tool is aimed at visualizing and analyzing any graph, we will use specific plugins to analyze large linked-data datasets like DBpedia.


Seminar 4: Neuroscience Focus on Big Data

  • Professor / Speaker: Prof. Ángel Merchan
  • Affiliation: UPM, Biomedical Technology Center CTB
  • Contact: Prof. Ángel Merchan, amerchan@fi.upm.es
  • Summary: Recent developments in Neuroscience are providing large amounts of data that need to be analyzed, interpreted and applied. In this seminar we will briefly review the different processes involved in the acquisition of neuroanatomical data and their analysis by tailored software tools. How this approach helps us understand the brain will also be discussed.


Seminar 5: Data Management in Biomedicine

  • Professor / Speaker: Prof. Víctor Maojo
  • Affiliation: UPM, Biomedical Informatics Group
  • Contact: Prof. Víctor Maojo, vmaojo@fi.upm.es
  • Summary: Applications of Artificial Intelligence in medicine began in the early 1970s and were initially centered on knowledge-based systems. Two decades later, various limitations of this approach caused the area to partially shift focus towards data-centered applications. The different -omics projects that appeared after the Human Genome Project, together with increasingly available electronic health records, clinical trials data and a huge number of data resources available over the Web, led to directions such as Big Data-related research, among others. In this seminar we will review this evolution and the main approaches, methods and techniques for data management in biomedicine, along with their advantages and limitations.


Seminar 6: Location Intelligence at Geoblink

  • Professor / Speaker: Dr. Marta Borrajo (Senior Data Scientist), Ignacio Platón (Data Scientist) and Marta Benito (Data Scientist)
  • Affiliation: Geoblink
  • Contact: ddominguez@geoblink.com
  • Summary: Selected by Bloomberg as one of the 50 most promising startups in the world, Geoblink is a SaaS-based Location Intelligence solution that combines GIS, analytics and Big Data to help professionals from the retail, real estate, and FMCG industries make informed decisions about their business strategies. In this seminar, Marta Borrajo, Marta Benito and Ignacio Platón, all data scientists with years of experience in the field, will talk about their experience at Geoblink, including data science and engineering, and the structure, history and operation of the Madrid-based startup.

Seminar 7: Data Science in the industry and sample applications

  • Professor / Speaker: Manuel Guzmán, Director of R&D department
  • Affiliation: Management Solutions, Cátedra iDanae en Analytics y Big Data
  • Contact: Prof. Ernestina Menasalvas, ernestina.menasalvas@upm.es
  • Summary: A talk on the current state of the industry in relation to Big Data and Data Science: data usage, exploratory data analysis, modelling techniques, data products and specific software. Analysis of the data scientist profile, industry needs and available resources. Discussion of specific case studies and research projects applying data analysis, machine learning algorithms or web scraping techniques. Ethics and data science projects.

Seminar 8: Cognitive Accessibility and Easy-to-Read Methodology

  • Professor / Speaker: María del Carmen Suárez de Figueroa, PhD.
  • Affiliation: Ontology Engineering Group, ETSII, UPM.
  • Contact: mcsuarez@fi.upm.es
  • Summary: People with disabilities have the right to participate in all the activities in society, such as politics, education, work and culture, in the same way as other people. One of the ways to guarantee this right is to improve various accessibility aspects of the materials and documents available in any of the aforementioned areas. At the same time, the number of laws and regulations related to accessibility, at local, national and European levels, is continuously increasing. These rules explicitly mention the need for documents to be written in simple, plain, clear and direct language. This need is directly related to the creation of materials and documents following the guidelines included in the Easy-to-Read Methodology. This methodology provides guidelines and recommendations for writing texts and making materials that are easy to understand by people with reading comprehension difficulties.

Seminar 9: The utopia of AI. Applying natural language processing techniques

  • Professor / Speaker: Noa Cruz Díaz
  • Affiliation: CaixaBank
  • Contact: Noa Cruz Díaz, ncruzd@bankia.com
  • Summary: NLP is living its golden age, in which the disruption brought by pre-trained language models and transfer learning has dramatically improved performance on some NLP tasks, to the point of exceeding human-level accuracy. However, what happens when the understanding of language involves speech and its transcription has poor semantic signals? In these cases, errors are propagated throughout the entire methodological life cycle, from the annotation to the final solution, complicating the process so much that building AI systems appears to be a huge challenge, or maybe even a utopia.


Seminar 10: Natural Language Processing and Understanding in Action - Applications in Misinformation Detection, Earth Sciences and Space

  • Professor / Speaker: José Manuel Gómez-Pérez, Andrés García-Silva
  • Affiliation: expert.ai
  • Contact: José Manuel Gómez-Pérez, jmgomez@expert.ai
  • Summary: Natural language processing and understanding (NLP/U) deal with extracting meaningful information and insights from text documents, as well as enabling machines to understand such content in depth, similar to how a human would read a document. To name but a few, NLP/U tools include entity detection, relation extraction, sentiment analysis, text classification and topic modeling, and have found successful applications in many sectors, including health, education, legal, security, defense, insurance, and finance. In this seminar, we highlight three domains of application for NLP/U technologies of particular novelty and impact potential (earth sciences, misinformation detection, and science and engineering in Space) and use them as vehicles to illustrate the most recent advances in state-of-the-art NLP/U technologies. Among other things, we will show how language can be used to automatically identify misinforming text and help fact checkers deal with misinformation, how to extract relevant information from scientific bibliography to assist scientists in their daily research work, and how to support space agencies like ESA in enforcing quality protocols and designing the interplanetary discovery missions of the future.


Seminar 11: Challenges in building an AI solution. AI capabilities orchestration and decision models (BANKIA)

  • Professor / Speaker: Stéphane Maraut, Victor Latorre
  • Affiliation: CaixaBank
  • Contact: smaraut@bankia.com, vlatorreg@bankia.com
  • Summary: Creating an AI solution with the ability to solve complex tasks and make autonomous decisions is not only about building and stacking some Machine Learning models; it also requires designing and orchestrating interconnected AI capabilities. By this, we mean that it is necessary to design adaptive pipelines and incorporate prior knowledge into your models, which involves facing challenges such as multiple-candidate selection, uncertainty propagation and decision models. To achieve this, we need a specific methodology and an adapted AI platform and tools, in addition to the classic DS methodology and life cycle frameworks.


Seminar 12: Blockchain for Data Science and Engineering

  • Professor / Speaker: María Salgado Iturrino, Blockchain Manager at Inetum
  • Affiliation: Inetum
  • Contact: María Salgado, m.salgado@inetum.world
  • Summary: Blockchain is an emerging technology which can radically change how businesses and organizations are run. It treats data in a non-traditional way because it is decentralized and crypto-based. Nowadays, the combination of different technologies is a fact in almost every project. Making the most of each one allows industries to achieve more powerful solutions. An in-depth understanding of how blockchain technology works is therefore of great interest to data scientists.


Seminar 13: Atmospheric Science Modelling Systems

  • Professor / Speaker: Prof. Roberto San José
  • Affiliation: UPM, Environmental Software and Modelling Group
  • Contact: Prof. Roberto San José, roberto@fi.upm.es
  • Summary: This seminar is focused on describing and presenting current atmospheric modelling systems, covering areas ranging from air pollution and meteorology to climate change. The seminar is organized so that, in two sessions, the student receives a wide and complete overview of the different open source atmospheric models which are used today to analyze historical weather patterns and to predict or forecast, at different scales, climate change (different IPCC scenarios), air pollution and even health impacts in humans. The complexity of these models and the use of supercomputer platforms will be underlined, and the different data formats and visualization tools will be explained. The seminar will include practical examples from several EU-funded and private projects in the area of Environment and ICT during the last 25 years.


Seminar 14: Feature Extraction in Images

  • Professor / Speaker: Prof. Raúl Alonso
  • Affiliation: UPM, Biomedical Informatics Group
  • Contact: Prof. Raúl Alonso, ralonso@fi.upm.es
  • Summary: In this seminar we will review the current state of the art in basic features and algorithms that are used for describing images. These descriptors range from basic mathematical features for describing regions of an image or even an entire image, to more advanced algorithms for creating a fingerprint of an image based on key points selected from the original image. Implementations of feature extraction algorithms available in OpenCV will be reviewed.