Activity Overview

You have learned that many data professionals now use conversational AI tools like Gemini and ChatGPT to help them analyze their data and boost their productivity. Gemini and ChatGPT are both large language models (LLMs) that are trained on massive datasets of text and code. LLMs can generate human-like text in response to a wide range of prompts and questions. In this activity, you’ll discover the capabilities of conversational AI by writing your own prompts for Gemini.

LLM prompts and best practices

Data professionals can use LLMs to improve their data analysis, perform essential tasks, and collaborate with teammates. Here are some useful prompts for data science workflows:

  • Data cleaning. LLMs can automate tasks such as data cleaning and coding. For example, you can ask an LLM to clean a dataset by removing missing values, outliers, and duplicate data.

  • Exploratory data analysis (EDA). LLMs can perform exploratory data analysis (EDA) on datasets. For example, you can ask an LLM to create data visualizations, identify patterns and trends, and calculate summary statistics.

  • Modeling. LLMs can build and evaluate models. For example, you can ask an LLM to build a machine learning model to predict an outcome, and evaluate the performance of the model.

  • Interpreting results. LLMs can interpret the results of models. For example, you can ask an LLM to explain the features that are most important for a model, or generate insights from the results of a model.

  • Collaboration. LLMs can help you collaborate with teammates. For example, you can ask an LLM to create a shared document for a brainstorming session with a team of data professionals.
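To make the first use case concrete, here is the kind of data-cleaning code an LLM might produce in response to a prompt like “remove missing values, outliers, and duplicate data.” This is a minimal sketch using only the Python standard library; the records, column names, and the outlier threshold are all hypothetical choices, not part of the activity:

```python
from statistics import mean, stdev

# Hypothetical raw records: one purchase amount per customer, with a
# missing value, a duplicate, and an outlier mixed in for illustration.
raw = [
    {"customer": "a", "amount": 20.0},
    {"customer": "b", "amount": None},   # missing value
    {"customer": "c", "amount": 22.0},
    {"customer": "c", "amount": 22.0},   # duplicate
    {"customer": "d", "amount": 19.0},
    {"customer": "e", "amount": 21.0},
    {"customer": "f", "amount": 500.0},  # outlier
]

# 1. Remove rows with missing values.
rows = [r for r in raw if all(v is not None for v in r.values())]

# 2. Remove exact duplicates, keeping the first occurrence.
seen, deduped = set(), []
for r in rows:
    key = tuple(sorted(r.items()))
    if key not in seen:
        seen.add(key)
        deduped.append(r)

# 3. Remove outliers: here, amounts more than 1.5 standard deviations
#    from the mean (a deliberately tight threshold so a single extreme
#    value stands out on this tiny sample).
amounts = [r["amount"] for r in deduped]
mu, sigma = mean(amounts), stdev(amounts)
cleaned = [r for r in deduped if abs(r["amount"] - mu) <= 1.5 * sigma]

print(len(cleaned))  # 4 rows survive all three cleaning steps
```

In practice an LLM would more likely suggest pandas (`dropna`, `drop_duplicates`) for the same steps; part of your job as a data professional is to read whatever code it returns and confirm each step does what the prompt asked.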

Pro tip: Be sure to structure your prompts in a way that makes it easier for the LLM to fulfill your requests and answer your questions.

The following suggestions are best practices for writing prompts for LLMs:

  • Be clear and concise in your instructions. It is important to be clear and concise in your instructions so the LLM can understand how to help you. Details are great—just make sure they’re useful and relevant. Avoid giving the LLM unnecessary information.

  • Be precise. When posing a question to an LLM, be precise about the input (if any) and the desired output.

  • Include a description of the LLM’s role. This reinforces the purpose of your prompt. For example, you can tell the LLM to assume the role of a data scientist by writing “Act as a data scientist” or “You are a data scientist.”

  • Provide context. Providing context allows the LLM to understand the nuances of the relevant issue and generate more informed responses.

  • Try multiple prompts. Trying different prompts can provide different perspectives on a problem and enable the LLM to generate a variety of useful responses.

To help get you started, consider the following specific examples of prompts that data professionals can give an LLM:

  • “Act as a data scientist and write a detailed plan for a credit card fraud detection project.”

  • “I have a dataset of customer purchases at an online retail store. Act as a data scientist and write Python code for data visualization and exploration.”

  • “I have a dataset of customer characteristics and churn for an online video streaming service. Act as a data scientist and create a shared document for a team meeting.”

  • “Act as a data generator and use Python code to generate a CSV file that contains mock employee data for a restaurant chain named Fast. The dataset has 100 rows and 5 columns. The columns are name, address, employee_id, department_id, email.”

  • “Act as a communications expert and share best practices for explaining a data science report to a business executive with no technical background.”

Note: LLMs are powerful, but they are still under development, including Gemini, which remains experimental. As a data professional, it’s important to use your own judgment when interpreting the results. LLMs can generate insights that you may not have thought of on your own; however, it’s ultimately your responsibility to verify the results and make sure they make sense.