Artificial intelligence applications can be used at various stages of scientific information retrieval. They are particularly suitable for the planning phase of information retrieval and for actual searches that require only a small number of references. AI has also been integrated into many existing information retrieval tools.
At least for now, AI cannot generate extensive searches from diverse sources with comprehensive coverage of references. You must always check the answers produced by AI yourself, as is often the case with other uses of AI.
Prompt and discuss:
Utilize the conversation option of AI applications in your information search. You can ask follow-up questions to the application either on your own or by choosing from ready-made questions. Through conversation, you often get more precise answers and a better overall understanding of the topic. Test various terms in your prompts and try different conversation starters.
Follow the rules:
It is important to follow the guidelines provided for AI use in studies and research. If the use of AI is permitted, it is generally recommended to indicate how it has been used in the work. Check out UEF's general guidelines on the use of AI.
Remember data security! Do not provide personal or confidential information to AI. See the UEF guidelines for more information.
AI applications for different stages of information retrieval
It is recommended to primarily use AI applications acquired by UEF, as their providers have an agreement on the processing of personal data.
Prefer the Enterprise version of Microsoft Copilot:
Copilot is an AI application suitable for information retrieval needs at UEF. You must sign in to UEF's Copilot with your Microsoft account. See the instructions in the UEF guidelines mentioned above.
With the help of AI-generated answers, you can quickly familiarize yourself with a new topic: what it is about and what aspects are involved. AI is also a good tool for defining concepts. Read more about exploring the topic and concepts (without AI) in the Guide to Information Retrieval.
You can proceed in two ways. In both cases, be prepared for the possibility that AI may “hallucinate,” i.e., produce inaccurate text.
1. Start by asking AI about the topic. The answer will help you get an idea of what the topic includes and where to direct your information retrieval. Note that if the topic is completely unfamiliar to you, it is difficult to assess the accuracy of the AI-generated answer. Therefore, verify the information from proper scientific sources.
Examples of how to start conversation:
- “What does microaggression mean?”
- “What kind of actions can be done to prevent eutrophication of waters?”
2. Familiarize yourself with the topic elsewhere first, for example, by reading research literature, and then ask AI more specific questions. When you are already more familiar with the topic, you will be better able to assess the accuracy of the answers. You can also get more precise or more relevant answers when you include concepts and terminology that accurately describe your research topic in the prompt.
Examples of how to start conversation:
- “How can teachers promote a positive climate at elementary schools through inclusive classroom activities?”
- “I'm in the process of brainstorming the subject of my thesis. I am familiar with Lev Vygotsky's theory of the zone of proximal development through research literature. In my thesis, I want to compare Vygotsky's theory to some other theory of learning. List five learning theories that I can possibly use as a reference in my thesis.”
Examples of suitable AI applications:
- Copilot (Microsoft) – recommended
- ChatGPT 4o-versions (OpenAI)
- Gemini (Google)
Keywords:
AI applications are already quite good at suggesting keywords for information retrieval, especially in English. Applications are capable of finding word equivalents and synonyms in different languages.
However, they are not very good at formulating keywords in such a way that they work effectively when searching for information in databases specialized in scientific publications. Typically, the application presents rather long expressions, for example, ‘dietary effects on gut microbial diversity’ or ‘the importance of sleep for memory function’. These need to be broken down into simple, separate terms, e.g., ‘diet - gut microbes - diversity’ or ‘sleep - memory’.
Keyword lists are rarely comprehensive, at least not on the first try. You can ask the application to provide more terms and specify the types of words you want. You can request plenty of keywords and then choose the best ones. To get a more comprehensive view, you can try multiple AI applications.
Queries:
The best AI applications are also able to formulate ready-made search queries for databases: keywords are combined with operators and phrases are marked properly. However, word truncation is usually lacking and the formulation of individual keywords is unfinished.
Always evaluate for yourself which keywords are actually useful for your topic. Also, check the logic and functionality of the search query. In the picture you see a correct query model and points to pay attention to.
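The query structure described above (synonyms combined with OR, concepts combined with AND, phrases in quotation marks, truncation with an asterisk) can be illustrated with a minimal Python sketch. The function names and keyword lists below are hypothetical and chosen only for illustration. The operator and truncation syntax follows a common convention, but it varies between databases, so check the search help of the database you are using.

```python
def format_term(term: str) -> str:
    """Put multi-word phrases in quotation marks; leave single words as they are."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def build_query(concepts: list[list[str]]) -> str:
    """Combine synonyms with OR inside each concept and concepts with AND."""
    groups = []
    for synonyms in concepts:
        terms = [format_term(t) for t in synonyms if t.strip()]
        if terms:  # skip concepts that have no usable terms
            groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Hypothetical keyword lists for the topic "eating disorders and young people".
# The asterisk marks truncation, e.g. adolescen* covers adolescent and adolescence.
eating_disorders = ["eating disorder", "anorexia", "bulimia"]
young_people = ["adolescen*", "teenager*", "young people", "youth"]

query = build_query([eating_disorders, young_people])
print(query)
# ("eating disorder" OR anorexia OR bulimia) AND (adolescen* OR teenager* OR "young people" OR youth)
```

Keeping each concept as its own list makes it easy to compare an AI-suggested query against this structure: each list should hold only simple, separate terms for one concept.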
Examples of how to start conversation:
- “Can you find synonyms for the term sensory defensiveness?”
- “Suggest keywords for an information search on how the use of wood affects fine particle emissions.”
- “There are two main concepts: environmental effects and vehicles. Can you give me synonyms and related search terms for both of these in table form, with the concepts as columns?”
- “Use Boolean logic to find publications about eating disorders and young people. Use several synonyms for both concepts.”
Examples of suitable AI applications:
- Copilot (Microsoft) – recommended
- ChatGPT 4o-versions (OpenAI)
- Gemini (Google)
- Perplexity
FintoAI is an application specializing in keywords. It suggests words based on the YSO thesaurus in Finnish, Swedish or English. FintoAI takes a text as input, e.g. a summary describing the topic.
When searching for information, it is advisable to first distinguish between applications that do not search for information themselves but base their entire response on a generative language model, i.e., simply on the probabilities of word occurrences. An example of this is the basic version of ChatGPT, first published in 2022, which is neither a database nor a search engine.
Many other applications utilizing language models perform the task using genuine sources:
Applications utilizing generative language models
Choose an application that performs genuine information retrieval from real sources and uses these sources to generate an answer. Depending on the application, the search can cover open web sources (e.g. Copilot and ChatGPT 4o) or a specific database containing selected information, such as open access scientific articles (e.g. Elicit). The result of the search is a short answer to the question or a summary of the topic, along with the sources used in the answer.
Formulate the question clearly but concisely and include sufficiently precise terms to get a relevant answer. Partition a complex task into smaller parts.
The answer produced by AI is often based on a very small number of documents. There is little information on how the application selects the sources it uses, which undermines the reliability of the answer. Therefore, check the original sources which the AI refers to. Also, do your own search from other sources to get a more complete picture.
Note that AI does not always do what it is asked to do. For example, even if you ask the AI application to base its response only on scientific or peer-reviewed articles, it may not necessarily do so.
Examples of how to start conversation:
- “What are the key challenges in addressing environmental exposure in the Global South?”
- “What is the role of internal communication in organizational change, based on articles in peer-reviewed journals?”
- “I am writing an academic essay about eating disorders in young adults and supporting the healing process. Could you suggest academic articles? Try to find peer-reviewed publications.”
Examples of suitable AI applications:
- Copilot (Microsoft) – recommended
- ChatGPT 4o-versions (OpenAI)
- Elicit
AI is also used in search engines in ways that are not based on generative language models:
Applications structuring and visualizing search results
In these search engines, information retrieval proceeds in the traditional way, with search queries. The underlying database contains open access scientific publications, such as articles, making it a reliable source of information. The application uses AI to group the search results, making it easier to evaluate the themes related to the topic. The result of the search is a list of references.
Example of a suitable application:
- Open Knowledge Maps
Search by existing article
Some applications use an existing text or document suitable for the topic, such as an article or an abstract, as a starting point. The application analyses the text and searches for similar references. The result of the search is thus a list of references.
Examples of suitable applications:
- Connected Papers
- Elicit
- Keenious
- Research Rabbit
Many traditional databases (e.g., Scopus, Web of Science) contain a similar function that suggests further references based on the selected article.
Different referencing styles have their own general rules on how to cite AI. E.g. in APA style, the in-text citation for ChatGPT is (OpenAI, 2024) and the reference list entry is:
OpenAI. (2024). ChatGPT (May 13 version) [Large language model]. https://chat.openai.com/chat
Read more:
How to cite ChatGPT? (APA)
How do I cite generative AI in MLA style?
The Chicago Manual of Style / Q&A
Nevertheless, always follow the guidelines of your own discipline or publisher regarding the use, reporting and citing of AI.
Limitations of AI in information retrieval
Restricted sources:
When using an AI application that genuinely retrieves existing sources, the answer produced by AI is often based on a very small number of documents. There is little information on how the application selects the sources it uses, which weakens the reliability of the answer. AI applications can reach only open access materials. Some of the most relevant sources may be missing altogether. Therefore, check the original sources to which the AI refers. Also, conduct your own search in actual scientific databases to get a more comprehensive view.
User accounts and charges:
Many AI applications can be used for free, but their capacity and features are limited. Paid versions offer more services. Even free versions often require the creation of a user account.
Privacy:
From a privacy perspective, there may be problems with the use of AI applications. Applications can store personal information and conversations, which can be used for the tool’s own purposes and possibly transferred further.
Copyright:
AI-generated outputs do not have copyright. Instead, the AI application itself may infringe copyright when using material found online without permission. An AI application that cites its sources is a safer choice in terms of copyright.