Automating Diamond Shield: Python Dataset Generation for Communication Training
Main contact

Project scope
Categories
Artificial intelligence, Data analysis, Data modelling, Data science, Data visualization
Skills
communication strategies, Python (programming language), data science, communication, chatbot, scalability, simulations, automation, artificial intelligence
Diamond Shield is a methodology we have developed for generating creative responses to manipulative communication. While the theory and response patterns are already defined, what we need now is a scalable dataset to train our AI chatbot, ChatBoy.
This internship focuses on the data science side of the project: using Python to take an existing pattern of manipulative maneuvers and comeback strategies and automate the production of structured datasets. Students will generate large numbers of response examples, tag them by manipulation type and response “flavor,” and prepare the data for future visualization and integration into training simulations.
The core challenge is not inventing new communication strategies, but rather automating and organizing data at scale. The output will be a structured and well-tagged dataset, demonstrating how Diamond Shield responses can be systematically categorized and analyzed—an essential step in training AI systems for more advanced simulation environments.
By the end of the internship, students will have transformed our existing Diamond Shield response pattern into a structured, scalable dataset suitable for AI training. Using Python, they will automate the generation of responses, tag them by manipulation type and by response “flavor” (e.g., calm, witty, direct), and organize the data for future use in simulation environments.
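As a rough illustration of the generate-and-tag step described above, here is a minimal Python sketch. It assumes the response pattern can be expressed as a set of maneuvers crossed with flavors. The maneuver names, the template strings, and the field names (`manipulation_type`, `flavor`, `response`) are hypothetical placeholders, not the actual Diamond Shield pattern or schema; only the three flavors come from the description above.

```python
import csv
import io
import itertools

# Hypothetical placeholders: the real maneuver names and response templates
# would come from the existing Diamond Shield pattern, not reproduced here.
MANEUVERS = ["maneuver_a", "maneuver_b"]
FLAVORS = ["calm", "witty", "direct"]  # flavors named in the project description
TEMPLATES = {
    "calm": "[calm response to {maneuver}, variant {variant}]",
    "witty": "[witty response to {maneuver}, variant {variant}]",
    "direct": "[direct response to {maneuver}, variant {variant}]",
}
FIELDS = ["id", "manipulation_type", "flavor", "response"]


def generate_records(maneuvers, flavors, variants_per_pair):
    """Yield one tagged record per (maneuver, flavor, variant) combination."""
    combos = itertools.product(maneuvers, flavors, range(variants_per_pair))
    for record_id, (maneuver, flavor, variant) in enumerate(combos, start=1):
        yield {
            "id": record_id,
            "manipulation_type": maneuver,
            "flavor": flavor,
            "response": TEMPLATES[flavor].format(maneuver=maneuver, variant=variant),
        }


def to_csv(records):
    """Serialize tagged records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = list(generate_records(MANEUVERS, FLAVORS, variants_per_pair=2))
csv_text = to_csv(records)
```

Raising `variants_per_pair` or extending the placeholder lists scales the output multiplicatively, which is one simple way to reach thousands of examples. In practice the templates would be replaced with real response text.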
Deliverables will include:
- A Python script that automates the generation and tagging of response data.
- A structured dataset containing thousands of examples, aligned with manipulative maneuvers and categorized by flavor.
- A tagging schema and documentation describing the structure, categories, and logic used.
- A basic visualization or summary report showing response distribution across types and flavors, demonstrating the dataset’s readiness for analysis and integration.
These outputs will directly support the next stage of ChatBoy’s development—moving from static quiz questions toward more dynamic training simulations—while giving students hands-on experience in automation, tagging, and dataset design, core skills within data science.
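The summary report in the deliverables could start from a simple count of examples per manipulation type and flavor. The sketch below is one possible approach using only the standard library. The sample records and field names are hypothetical and stand in for whatever the final tagging schema defines.

```python
from collections import Counter

# Hypothetical sample records; a real run would load the generated dataset.
SAMPLE_RECORDS = [
    {"manipulation_type": "maneuver_a", "flavor": "calm"},
    {"manipulation_type": "maneuver_a", "flavor": "calm"},
    {"manipulation_type": "maneuver_a", "flavor": "witty"},
    {"manipulation_type": "maneuver_b", "flavor": "direct"},
]
FLAVORS = ["calm", "witty", "direct"]  # flavors named in the project description


def summarize(records):
    """Count records per (manipulation_type, flavor) pair."""
    return Counter((r["manipulation_type"], r["flavor"]) for r in records)


def format_table(counts, flavors):
    """Render the counts as a plain-text table, one row per manipulation type."""
    types = sorted({mtype for mtype, _ in counts})
    header = "type".ljust(14) + "".join(f.rjust(8) for f in flavors)
    rows = [
        mtype.ljust(14) + "".join(str(counts[(mtype, f)]).rjust(8) for f in flavors)
        for mtype in types
    ]
    return "\n".join([header] + rows)


counts = summarize(SAMPLE_RECORDS)
table = format_table(counts, FLAVORS)
print(table)
```

Zero counts in the table flag type and flavor combinations that the dataset does not yet cover, which is useful to check before the data is used for training.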
Providing specialized knowledge in the project subject area, with industry context.
Sharing knowledge in specific technical skills, techniques, methodologies required for the project.
Direct involvement in project tasks, offering guidance, and demonstrating techniques.
Providing access to necessary tools, software, and resources required for project completion.
Scheduled check-ins to discuss progress, address challenges, and provide feedback.
About the company
Representation
Diversity and inclusion
Categories highlighting this company’s ownership and values
BIPOC-Owned, Community-Focused, Immigrant-Owned, Minority-Owned, Neurodivergent-Owned, Small Business, Social Enterprise, Women-Owned
FIA is developing tools and content to help women spot power moves and respond 10X more effectively, whether at work, in love, or with family members. By learning the skills of social discernment, you can keep your peace and your power when others try to throw you curveballs.