Resolving the Pain Points of Duplicate IDs at UT Dallas

By Alyssa Phillips posted 11-21-2024 12:03 PM

  

ID management, specifically the identification, reduction, and removal of duplicate and merged IDs, is a crucial and challenging part of maintaining our student information systems. We at the University of Texas at Dallas Office of Enrollment Management have been working on this challenge for many years, and over that time we’ve gotten quite good at both mitigating the creation of duplicate IDs and quickly resolving those that do get created. I presented on this subject at Alliance earlier this year, so some of this may sound familiar to those who attended, but I thought this would be a good avenue to share how we currently handle duplicates, along with lessons learned and the things we’ve done to get a handle on our duplicate IDs and make them more manageable overall.

Locating Duplicate IDs

As part of the Admissions department at UT Dallas, we work primarily with applicants and incoming students, but we also handle the import of test scores, transcripts, and other new data, which is often where duplicate IDs are created. More often than not, we catch duplicates before they get past the admissions stage, which is exactly what we want: it reduces both the impact on the student and the amount of work that multiple people in multiple departments need to do to resolve the duplicate. The admissions process is sometimes slow until it isn't, so it's important that we can quickly and reliably identify duplicates and resolve them in a timely fashion.

See Something, Say Something

The frontline of duplicate reporting is our front-facing staff - anyone who interacts directly with applicants and students, and especially those helping troubleshoot issues or find missing documents. Recruiters, front desk staff, and those answering phones or emails are all valuable resources for identifying duplicate accounts, and we find it most useful if we can both encourage reporting and funnel it all into a single place - in our case, an email account with an easily memorable (and to the point) name.

It's not uncommon to get some false positives reported - twins, or even just unrelated people who happen to have similar names and birthdays - but as I always tell our staff, I'd rather they err on the side of caution and report any potential duplicates than let one slip through the cracks.

Use Your Reports

The best way to get ahead of duplicates is to report on them. If you start to see common types of information that match across duplicates, or commonalities in which information differed and caused the duplicate to be created, you can start to report on those things. Sometimes you need really broad reports that may flag a lot of false positives and that you check only occasionally, or you can make a few very narrow reports that you check daily and that find specific duplicate cases but nothing else. There can be a middle ground, and we had to do a lot of tweaking on our reports, but good reports also find people before they even have a chance to be reported or have documents go missing (from their main ID at least).

We found that we could often match on first name, date of birth, and part of the email, so that's one report we've built to identify potential duplicates. We can limit the report to only look at new IDs created within a specific time frame, so false positives will eventually fall off the list (but don't forget about holidays!). There are other combinations of data points that match and don't match, so when you're working on duplicates, keep an eye out for any patterns that can help make identification easier in the future.
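To make the idea concrete, here is a minimal sketch of that kind of report in Python. It is not our actual report: the record layout, the field names, and the choice to treat the part of the email before the "@" as the "part of the email" are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def email_local_part(email):
    """Return the lowercased part of an email address before the '@'."""
    return email.split("@", 1)[0].strip().lower()


def candidate_duplicates(records, today, window_days=14):
    """Group records by (first name, date of birth, email local part).

    Returns the groups of IDs that share a key, keeping only groups in
    which at least one ID was created inside the look-back window.
    The field names and the 14-day default are hypothetical.
    """
    cutoff = today - timedelta(days=window_days)
    groups = defaultdict(list)
    for rec in records:
        key = (
            rec["first_name"].strip().lower(),
            rec["dob"],
            email_local_part(rec["email"]),
        )
        groups[key].append(rec)

    report = []
    for recs in groups.values():
        # Only report a group if it contains a recently created ID.
        if len(recs) > 1 and any(r["created"] >= cutoff for r in recs):
            report.append(sorted(r["id"] for r in recs))
    return sorted(report)


# Made-up sample data: the first two records look like the same person.
records = [
    {"id": "1001", "first_name": "Maria", "dob": date(2005, 3, 14),
     "email": "mgarcia05@example.com", "created": date(2023, 9, 1)},
    {"id": "2002", "first_name": "maria", "dob": date(2005, 3, 14),
     "email": "MGarcia05@example.org", "created": date(2024, 11, 18)},
    {"id": "3003", "first_name": "Liam", "dob": date(2004, 7, 2),
     "email": "liam.chen@example.com", "created": date(2024, 11, 19)},
]

print(candidate_duplicates(records, today=date(2024, 11, 21)))
# prints [['1001', '2002']]
```

The look-back window is what lets old false positives drop off the list. If nobody checks the report for a while, for example over a holiday break, a larger `window_days` keeps IDs created during the gap from being skipped.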

Keep Duplicates In Mind

The last tip I have on finding duplicates is to always keep them in mind. When troubleshooting for an applicant who says they submitted their transcript two weeks ago but you can't find it in their account, look for a duplicate ID. It's not always the case, but it does happen often enough. Check for documents under a different name order, and don't forget to use wildcards.
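If you are searching an exported list of names rather than a built-in search page, the same two habits (try other name orders, use wildcards) can be sketched in a few lines of Python. The function names and sample names here are hypothetical, and the wildcard syntax is the shell-style `*` from Python's standard `fnmatch` module, which may differ from the wildcard character your own system uses.

```python
from fnmatch import fnmatchcase
from itertools import permutations


def name_orderings(full_name):
    """Return every ordering of the words in a name, lowercased."""
    parts = full_name.lower().split()
    return {" ".join(p) for p in permutations(parts)}


def wildcard_search(pattern, names):
    """Return the names for which any word ordering matches the pattern."""
    pattern = pattern.lower()
    return [
        name for name in names
        if any(fnmatchcase(order, pattern) for order in name_orderings(name))
    ]


# Made-up names: the first two could be the same person entered two ways.
names = ["Nguyen Thi Anh", "Anh Nguyen", "Andrew Nash"]
print(wildcard_search("anh*nguyen", names))
# prints ['Nguyen Thi Anh', 'Anh Nguyen']
```

Trying every word order is fine for names of a few words, but the number of orderings grows quickly, so this is only meant for spot checks on short lists.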

Confirming Duplicate IDs

Once you have your reported duplicate IDs, you want to make extra sure that you're working on accounts for the same person. These are the things we keep in mind when confirming our duplicate IDs.

Matching Data

At UT Dallas, we use a "three point" rule for matching IDs and data to make sure that we have the same person. We consider full name to be a single point, and then match on date of birth, email address, phone number, home address, or any number of other data points to make sure we've got the right person. The more match points, the better, and keep an eye out for anything that doesn't match, as you might have two people who just have almost identical information - they could be twins, but they might not be.
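As a rough illustration of counting match points, the Python sketch below compares two records field by field and reports both what matches and what doesn't. The field list, the normalization, and the decision to skip blank fields are assumptions made for the example, not a description of our actual tooling, and a passing count is a prompt for human review rather than a merge decision.

```python
def normalize(value):
    """Lowercase and collapse whitespace so cosmetic differences don't matter."""
    return " ".join(str(value).lower().split())


# Hypothetical field names; full name counts as a single point.
MATCH_FIELDS = ["full_name", "dob", "email", "phone", "address"]


def compare_records(a, b, fields=MATCH_FIELDS):
    """Return (matches, mismatches) over fields that both records have filled in."""
    matches, mismatches = [], []
    for field in fields:
        va, vb = a.get(field), b.get(field)
        if not va or not vb:
            continue  # a blank field is neither a match nor a mismatch
        if normalize(va) == normalize(vb):
            matches.append(field)
        else:
            mismatches.append(field)
    return matches, mismatches


def meets_three_point_rule(a, b, required=3):
    """True when at least `required` fields match; mismatches still need a look."""
    matches, _ = compare_records(a, b)
    return len(matches) >= required


# Made-up records that agree on name, date of birth, and email.
a = {"full_name": "Jordan Lee", "dob": "2005-03-14",
     "email": "jlee@example.com", "phone": "555-0100"}
b = {"full_name": "jordan  lee", "dob": "2005-03-14",
     "email": "JLee@example.com", "phone": "555-0199"}

print(compare_records(a, b))
# prints (['full_name', 'dob', 'email'], ['phone'])
print(meets_three_point_rule(a, b))
# prints True
```

Returning the mismatches alongside the matches is deliberate: two records can clear three points on shared household data and still belong to two different people, so the fields that differ deserve as much attention as the ones that agree.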

Knowing your demographics can also help with identifying duplicates. Some cultures use a person's father's name as their middle name, which you can use to verify whether you have two different people, and others use multiple family names, so you can expand your searches using wildcards to find duplicate IDs. Use all the resources you have at hand - resumes and transcripts can help you match records, and even emergency contact information can sometimes be the missing link between possible duplicate IDs.

Consider Twins

Make sure you don't make more work for yourself by merging twins. Keep an eye out for information that differs between your possible duplicate records, as it might indicate siblings, and we obviously don't want to merge those records. Consider which data may be shared between people who live in the same place and which data would differ - applicants usually have their own cell phone numbers and email addresses, but may look like they share an email address if they were signed up for a prospective student event by a parent.

Mitigation of Duplicate IDs

While working on locating and confirming your duplicates, also note why and how they were created - what data was the mismatch that caused a new ID to be created? Was it an automated process that could be improved, or was it human error? Did it come from any particular system or source? How do you prevent this case from happening again, and SHOULD you? (Twins exist, and we don’t want to auto-merge them!)

We can learn a lot about our duplicate IDs while we're fixing them, and these are good lessons to learn and apply to our ID creation processes that can help mitigate the creation of duplicate IDs. We'll never be able to fully prevent them, but by updating Search/Match rules and making sure that anyone who is creating IDs is trained to keep duplicates in mind, we can definitely mitigate the creation of duplicates, and even get ahead on reporting them if someone has to create a new ID that they suspect is actually a duplicate.

That’s a quick summary of how we handle duplicates at UT Dallas and the kinds of lessons we’ve learned from years of de-duplication work and refining our processes. It is my sincere hope that this information is helpful and applicable to anyone else dealing with duplicates. Keep an eye on this space for another post about the flip side of this coin - merged IDs.
