    Earn 70,200 ($702.00)

    Time Remaining: due 2 years ago
    In Progress

    TypeScript Library for Record Comparison and Matching with External Libraries and Services Integration

    an123321
    Posted 2 years ago

    Bounty Description

    #Overview
    We are seeking an experienced TypeScript developer to build a library that compares and matches records from two tables in a SQL Server database. The library must run on Node.js and use strong typing. The tables hold similar data about companies, but some fields differ and some are only partially populated. The key challenge is that company names in the two tables will not match exactly.

    #Background
    The two tables in question are named Account and _990.

    The Account table contains the following fields: ID, Name, Acronym, AddressLine1, City, StateProvince, PostalCode, Country, PhoneNumber, and Website. Please note that not all fields, particularly address fields, may be populated for all records.

    The _990 table contains the following fields: EIN, Name, ICO, Street, City, State, Zip.
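
As a starting point, the two record shapes above could be typed roughly as follows. This is only a sketch: the field names come from the description, but the nullability of each field, the use of `string` for `ID` and `EIN`, the `Form990` type name, and the `blankToNull`/`toAccount` helpers are all assumptions made for illustration. It also assumes the sample CSV columns carry the same names as the table fields.

```typescript
// Assumed shape of an Account record; which fields may be null is a guess
// based on "not all fields ... may be populated for all records".
interface Account {
  ID: string;
  Name: string;
  Acronym: string | null;
  AddressLine1: string | null;
  City: string | null;
  StateProvince: string | null;
  PostalCode: string | null;
  Country: string | null;
  PhoneNumber: string | null;
  Website: string | null;
}

// Assumed shape of a _990 record (the type name Form990 is hypothetical).
interface Form990 {
  EIN: string;
  Name: string;
  ICO: string | null;
  Street: string | null;
  City: string | null;
  State: string | null;
  Zip: string | null;
}

// One parsed CSV row: column name -> raw text (possibly missing).
type CsvRow = Record<string, string | undefined>;

// Treat missing, empty, or whitespace-only cells as null.
function blankToNull(value: string | undefined): string | null {
  const trimmed = (value ?? "").trim();
  return trimmed === "" ? null : trimmed;
}

// Hypothetical mapper from a CSV row to a typed Account.
function toAccount(row: CsvRow): Account {
  return {
    ID: (row["ID"] ?? "").trim(),
    Name: (row["Name"] ?? "").trim(),
    Acronym: blankToNull(row["Acronym"]),
    AddressLine1: blankToNull(row["AddressLine1"]),
    City: blankToNull(row["City"]),
    StateProvince: blankToNull(row["StateProvince"]),
    PostalCode: blankToNull(row["PostalCode"]),
    Country: blankToNull(row["Country"]),
    PhoneNumber: blankToNull(row["PhoneNumber"]),
    Website: blankToNull(row["Website"]),
  };
}
```

Mapping blank cells to `null` up front lets later comparison code tell "field not populated" apart from "field populated but different".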

    The library should identify and match records between the Account and _990 tables, even if they have slight differences. Matches should be made based on the similarity of data in the Name, Acronym, City, StateProvince/State, and PostalCode/Zip fields.
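
One possible way to compare names that differ slightly is to normalize them and then measure character-bigram overlap (the Sørensen–Dice coefficient). The sketch below is an illustration, not the required method: `normalizeName`, `bigrams`, and `nameSimilarity` are hypothetical helper names, and the normalization rules are assumptions that would need tuning against the sample data.

```typescript
// Lowercase, drop apostrophes, turn "&" into "and", replace other
// punctuation with spaces, and collapse runs of whitespace.
function normalizeName(name: string): string {
  return name
    .toLowerCase()
    .replace(/['’]/g, "")
    .replace(/&/g, " and ")
    .replace(/[^a-z0-9\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Count each two-character substring of s (a multiset of bigrams).
function bigrams(s: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (let i = 0; i < s.length - 1; i++) {
    const gram = s.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  return counts;
}

// Sørensen–Dice similarity over bigram multisets, in the range [0, 1].
function nameSimilarity(a: string, b: string): number {
  const na = normalizeName(a);
  const nb = normalizeName(b);
  if (na === nb) return na.length > 0 ? 1 : 0;

  const ga = bigrams(na);
  const gb = bigrams(nb);
  let overlap = 0;
  let totalA = 0;
  let totalB = 0;
  ga.forEach((countA, gram) => {
    totalA += countA;
    overlap += Math.min(countA, gb.get(gram) ?? 0);
  });
  gb.forEach((countB) => {
    totalB += countB;
  });

  const total = totalA + totalB;
  return total === 0 ? 0 : (2 * overlap) / total;
}
```

With this normalization, `"St. Mary's Hospital, Inc."` and `"ST MARYS HOSPITAL INC"` reduce to the same string and score 1, while unrelated names score near 0. Other string metrics (for example Jaro-Winkler or token-based comparison) could be substituted behind the same function signature.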

    The methodology and algorithm are up to you. For example, you might look up the domain from the account's website and scrape additional data from the site to support a match, draw on open-source datasets, or even consider low-cost paid SaaS services.
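
If the website domain is used as an extra matching signal, a first step would be to reduce the free-text Website value to a bare domain. The helper below is a minimal sketch using plain string handling; `extractDomain` is a hypothetical name, and the assumption that Website values may appear with or without a scheme, `www.` prefix, port, or path is a guess about the data.

```typescript
// Reduce a website value such as "https://www.Example.org/about" to
// "example.org". Returns null when no plausible domain can be found.
function extractDomain(website: string | null | undefined): string | null {
  if (!website) return null;
  const withoutScheme = website
    .trim()
    .toLowerCase()
    .replace(/^[a-z][a-z0-9+.-]*:\/\//, ""); // drop "http://", "https://", etc.
  // Keep only the host part: cut at the first "/", "?", "#", or ":" (port).
  const host = (withoutScheme.split(/[/?#:]/)[0] ?? "").replace(/^www\./, "");
  // Require a dot that is not the first character, e.g. "example.org".
  return host.indexOf(".") > 0 ? host : null;
}
```

Placeholder values such as `n/a` yield `null`, so records without a usable website can simply skip any domain-based lookup.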

    #Project Requirements
    We require a TypeScript library that does the following:

    Record Matching: Identifies and matches records between the Account and _990 tables, and writes all matches to a third table called Account990Match with the fields AccountID, EIN, and ProbabilityScore.

    Match Scoring: Calculates a probability score for each match.
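
One simple way to produce such a score is a weighted average of per-field similarities that skips fields missing on either side. The sketch below assumes each similarity is already a number in [0, 1]; the `FieldSimilarities` type, the `probabilityScore` function, and the weights are all illustrative assumptions. The result is a heuristic score, not a calibrated probability; calibrating it would require labeled matches.

```typescript
// Per-field similarity in [0, 1]; null means the field was not
// populated on at least one side and should not count for or against.
interface FieldSimilarities {
  name: number;
  acronym: number | null;
  city: number | null;
  state: number | null;
  postalCode: number | null;
}

// Illustrative weights only (assumed, not tuned on real data).
const WEIGHTS: Record<keyof FieldSimilarities, number> = {
  name: 0.5,
  acronym: 0.1,
  city: 0.15,
  state: 0.1,
  postalCode: 0.15,
};

// Weighted average over the fields that are present, so a record with
// only a name is scored on the name alone rather than penalized.
function probabilityScore(sims: FieldSimilarities): number {
  let weighted = 0;
  let weightSum = 0;
  (Object.keys(WEIGHTS) as Array<keyof FieldSimilarities>).forEach((field) => {
    const value = sims[field];
    if (value === null) return;
    weighted += WEIGHTS[field] * value;
    weightSum += WEIGHTS[field];
  });
  return weightSum === 0 ? 0 : weighted / weightSum;
}
```

Renormalizing by the weights actually used keeps scores comparable between fully populated and sparsely populated Account records, at the cost of giving sparse records less evidence behind the same score.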

    Output: Stores the matches and their probability scores in the Account990Match table. Each Account can have zero, one, or multiple possible matches.
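
The output rows could be modeled as below. The three field names come from the requirements; everything else is an assumption for illustration, including the `string` types for `AccountID` and `EIN`, the `ScoredCandidate` type, the `toMatchRows` helper, and the default threshold of 0.7.

```typescript
// One row of the Account990Match output table.
interface Account990Match {
  AccountID: string;
  EIN: string;
  ProbabilityScore: number;
}

// A _990 candidate that has already been scored against one Account.
interface ScoredCandidate {
  ein: string;
  score: number;
}

// Keep every candidate at or above the threshold, best first.
// Returns an empty array when nothing qualifies, so an Account can
// contribute zero, one, or many rows.
function toMatchRows(
  accountId: string,
  candidates: ScoredCandidate[],
  minScore: number = 0.7
): Account990Match[] {
  return candidates
    .filter((c) => c.score >= minScore) // filter() copies, so the input is not reordered
    .sort((a, b) => b.score - a.score)
    .map((c) => ({ AccountID: accountId, EIN: c.ein, ProbabilityScore: c.score }));
}
```

Writing these rows to SQL Server is left out of the sketch, since it depends on the database client chosen for the library.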

    SQL Server Compatibility: The library should be designed to work against a SQL Server database on Azure.

    External Libraries and Services Integration: The library may use open-source external libraries, as long as each has at least 1,000 GitHub stars. The library may also pull web data using the Account.Website field to gather additional information. You may also propose low-cost third-party services for extra data if they significantly improve matching accuracy.

    Please note that you will NOT have direct access to our database. We will provide sample data in CSV files that you can use for development and testing.

    #Deliverables
    The end product should be a complete TypeScript library, compatible with Node.js, that satisfies the requirements outlined above. The library should include basic usage documentation and should be built with strong typing and TypeScript's strict mode flags turned on.

    #Evaluation Criteria
    Submissions will be evaluated based on the following criteria:

    #Completeness: The solution should meet all the requirements specified above.
    #Accuracy: The matching algorithm should accurately identify matches between the Account and _990 tables.
    #Code Quality: The code should be well-structured, easy to understand, and include comments where necessary.
    #Documentation: The library should include clear documentation on how to use it.
    #External Libraries and Services Integration: The library should effectively integrate external libraries and third-party services (IF NEEDED) to improve matching accuracy and performance, following the specified criteria.