The life of your data at Wuha
What data does Wuha collect?
Wuha lets you connect a number of applications and, as a result, collects a significant amount of data. Because the goal is to bring you the best information from your various sources, Wuha needs access to the data contained in these applications.
The type of data collected when you connect an application depends on the nature of that application. Connecting Wuha to:
Storage applications (Google Drive, Dropbox, Box, your computer...) will give you easy access to your documents by searching through their:
- name and content
- date of creation or modification
- format (pdf, docx, pptx...)
- authors & contributors
Email and messaging applications (Microsoft Outlook, Gmail, Slack...) will let you find the most relevant emails and attachments by searching with:
- the subject, label or content of the email and its attachments
- the format of the attachment (pdf, docx, pptx...)
- the date on which the email was sent or received
- the email address of the sender, recipient or other contacts in the email thread
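To make the idea of searching by several criteria concrete, here is a minimal sketch of a filter over email records. The `Email` class, its field names and the `search_emails` function are hypothetical and chosen for illustration only; they do not describe Wuha's actual data model or search engine.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Email:
    # Hypothetical record: field names are illustrative, not Wuha's schema.
    subject: str
    body: str
    sent_on: date
    sender: str
    recipients: List[str] = field(default_factory=list)
    attachment_formats: List[str] = field(default_factory=list)


def search_emails(emails, text=None, attachment_format=None,
                  sent_after=None, contact=None):
    """Return the emails that satisfy every criterion that was provided."""
    results = []
    for mail in emails:
        haystack = (mail.subject + " " + mail.body).lower()
        if text and text.lower() not in haystack:
            continue
        if attachment_format and attachment_format not in mail.attachment_formats:
            continue
        if sent_after and mail.sent_on <= sent_after:
            continue
        if contact and contact != mail.sender and contact not in mail.recipients:
            continue
        results.append(mail)
    return results


INBOX = [
    Email("Q3 invoice", "Please find the invoice attached.", date(2019, 9, 2),
          "alice@example.com", ["bob@example.com"], ["pdf"]),
    Email("Lunch?", "Are you free at noon?", date(2019, 9, 3),
          "carol@example.com", ["bob@example.com"]),
]

matches = search_emails(INBOX, text="invoice", attachment_format="pdf")
print([m.subject for m in matches])
```

Each criterion is optional, so the same function covers a search by content, by attachment format, by date or by contact, alone or combined.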
Your data for the best results
Your data is securely encrypted, and it is what allows our Data Scientists to offer you the best possible results from the vast amount of data at your disposal. All of our team's work is based on natural language understanding: NLP (Natural Language Processing) links your requests to the content of your documents. To do this, the NLP is organized as a "pipeline": to absorb the complexity of our Machine Learning model, we break each request down into a series of simpler processing steps.
In the interest of transparency, here are the steps on which our AI (Artificial Intelligence) relies:
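As a rough illustration of what a "pipeline" means here, the sketch below chains a few simple text-processing stages, each one feeding its output to the next. The stage names (`normalize`, `tokenize`, `drop_stopwords`) and the stopword list are assumptions made for the example; Wuha's real stages are not described at this level of detail.

```python
from functools import reduce

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "the", "of", "from"}


def normalize(request):
    """Lowercase the request and collapse repeated whitespace."""
    return " ".join(request.lower().split())


def tokenize(text):
    """Split the normalized text into words."""
    return text.split()


def drop_stopwords(tokens):
    """Remove words that carry little meaning for the search."""
    return [t for t in tokens if t not in STOPWORDS]


PIPELINE = [normalize, tokenize, drop_stopwords]


def run_pipeline(request, stages=PIPELINE):
    """Apply each stage in order, passing the result to the next one."""
    return reduce(lambda value, stage: stage(value), stages, request)


print(run_pipeline("  Find THE invoice from Alice "))
```

Keeping every stage small makes each one easy to test and replace on its own, which is the usual motivation for splitting a complex model into a pipeline.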
- Exploration of the application you are connecting (e.g. Google Drive)
Raw extraction of textual data and document enrichment. Among other things, this step identifies:
- the language of the document,
- the type of document (invoice, resume, purchase order, etc.), using a classification algorithm trained with supervised learning techniques,
- its similarity with other documents, which allows us to group these documents together for display.
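The grouping of similar documents can be sketched as follows. This example uses Jaccard similarity on word sets and a greedy grouping rule purely as a stand-in; the similarity measure, the `threshold` value and the sample documents are assumptions, not the method Wuha actually uses.

```python
def jaccard(text_a, text_b):
    """Share of distinct words that two texts have in common (0.0 to 1.0)."""
    set_a, set_b = set(text_a.lower().split()), set(text_b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def group_similar(documents, threshold=0.5):
    """Greedily put each document in the first group whose first member is similar enough."""
    groups = []
    for name, text in documents.items():
        for group in groups:
            representative = documents[group[0]]
            if jaccard(text, representative) >= threshold:
                group.append(name)
                break
        else:
            # No existing group matched: start a new one.
            groups.append([name])
    return groups


# Hypothetical extracted texts, keyed by file name.
DOCS = {
    "invoice_march.pdf": "invoice for consulting services march",
    "invoice_april.pdf": "invoice for consulting services april",
    "resume.docx": "software engineer resume with python experience",
}

print(group_similar(DOCS))
```

With these sample texts the two invoices share four of six distinct words, so they end up in the same group, while the resume forms a group of its own.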
Cleaning and enrichment: the data is sent to the Elasticsearch cluster, which is responsible for:
- Processing your request: our system, developed by our team of Data Scientists, uses Deep Learning techniques to perform NER (Named Entity Recognition) on requests in a fraction of a second. With these techniques, we can identify whether a request concerns a person, a date, a place, a noun phrase or a file extension.
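To show what recognizing entities in a request looks like in practice, here is a deliberately simple rule-based sketch. It is not a Deep Learning model: it swaps in lookup lists and a regular expression so the example stays self-contained. The name lists, the date pattern and the label names are all hypothetical, and noun phrases are left out because they need a real parser.

```python
import re

# Hypothetical lookup lists standing in for a trained model.
KNOWN_PEOPLE = {"alice", "bob"}
KNOWN_PLACES = {"paris", "lyon"}
FILE_EXTENSIONS = {"pdf", "docx", "pptx"}
# Matches 2019, 2019-03 or 2019-03-15.
DATE_PATTERN = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def tag_request(request):
    """Return (word, label) pairs for the words recognized in the request."""
    entities = []
    for token in request.lower().split():
        word = token.strip(".,;:!?")
        if word in FILE_EXTENSIONS:
            entities.append((word, "FILE_EXTENSION"))
        elif DATE_PATTERN.match(word):
            entities.append((word, "DATE"))
        elif word in KNOWN_PEOPLE:
            entities.append((word, "PERSON"))
        elif word in KNOWN_PLACES:
            entities.append((word, "PLACE"))
    return entities


print(tag_request("pdf contract sent by Alice from Paris in 2019-03"))
```

A learned NER model plays the same role as `tag_request` but generalizes to names, places and date formats it has never seen, which fixed lists cannot do.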
The algorithms that make up our NLP pipeline extract the best candidates from each connected source. A final processing step by our AI then lets us give you relevant results: based on your usage, Wuha shows you the most appropriate documents.
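Combining the best candidates from several sources into one result list can be sketched like this. The example assumes each source returns (title, score) pairs with scores that are directly comparable across sources, which is a simplification; the source names, titles, scores and the `merge_results` function are illustrative only.

```python
import heapq


def merge_results(results_by_source, limit=3):
    """Keep the `limit` highest-scoring hits across all sources."""
    candidates = [
        (score, source, title)
        for source, hits in results_by_source.items()
        for title, score in hits
    ]
    best = heapq.nlargest(limit, candidates)
    return [(title, source, score) for score, source, title in best]


# Hypothetical per-source results: (title, relevance score).
RESULTS = {
    "Google Drive": [("Q3 report.pdf", 0.92), ("Budget.xlsx", 0.40)],
    "Gmail": [("Re: Q3 report", 0.75)],
    "Dropbox": [("Old notes.txt", 0.10)],
}

for title, source, score in merge_results(RESULTS):
    print(f"{score:.2f}  {title}  ({source})")
```

In a real system the final step would also take personal signals into account; here the score alone decides the order.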