Manav Goyal

Neural Network - Artificial Intelligence

Input Data > Meaning > Learning and Improvement
NLP (Natural Language Processing)
- How humans communicate using natural language
- Pipeline
  - Sentence segmentation > Word tokenization > Stemming > Lemmatization > Stop-word removal
- Applications
  - Speech Recognition, Sentiment Analysis, Machine Translation, Chatbots
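The pipeline stages above can be sketched with the standard library alone. This is a rough illustration, not a production pipeline: `STOP_WORDS`, `LEMMAS`, and `SUFFIXES` are tiny hypothetical resources made up for the example, and the stemmer is a crude suffix stripper rather than a real algorithm such as Porter's.

```python
import re

# Hypothetical, deliberately tiny resources; real pipelines use far larger ones.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "were", "on", "in", "and"}
LEMMAS = {"was": "be", "were": "be", "is": "be", "are": "be", "mice": "mouse"}
SUFFIXES = ("ing", "ed", "s")


def segment_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters or apostrophes."""
    return re.findall(r"[a-z']+", sentence.lower())


def stem(token):
    """Crude suffix stripping; the result may not be a real word."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def lemmatize(token):
    """Dictionary lookup; unknown tokens are returned unchanged."""
    return LEMMAS.get(token, token)


def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def run_pipeline(text):
    """Run every stage and report the intermediate results per sentence."""
    results = []
    for sentence in segment_sentences(text):
        tokens = tokenize(sentence)
        results.append({
            "tokens": tokens,
            "stems": [stem(t) for t in tokens],
            "lemmas": [lemmatize(t) for t in tokens],
            "content": remove_stop_words(tokens),
        })
    return results
```

Stemming and lemmatization are shown side by side on the same tokens so their outputs can be compared; in practice a pipeline usually picks one of the two.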
NLU (Natural Language Understanding)
- Understanding what users say and their intent
- Challenges
  - Lexical Ambiguity
  - Syntactic (Structure) Ambiguity
  - Semantic (Meaning) Ambiguity
  - Pragmatic (Interpretation) Ambiguity
NLG (Natural Language Generation)
- Output should be intelligent and conversational
- Deals with structured data
- Text/Structure planning
- Corpus => Complete knowledge base
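One possible reading of "structured data" plus "text planning" is a two-step generator: first choose which facts to mention, then turn them into a sentence. The sketch below assumes a hypothetical weather record; the field names (`city`, `temp_c`, `rain_chance`) and the 0.5 threshold are invented for illustration.

```python
def plan_content(record):
    """Text planning: decide which facts to mention and in what order."""
    facts = [("city", record["city"]), ("temp_c", record["temp_c"])]
    # Only mention rain when it is likely (threshold is an arbitrary assumption).
    if record.get("rain_chance", 0) >= 0.5:
        facts.append(("rain_chance", record["rain_chance"]))
    return facts


def realize(facts):
    """Surface realization: turn the planned facts into a sentence."""
    values = dict(facts)
    sentence = f"In {values['city']} it is {values['temp_c']} degrees Celsius"
    if "rain_chance" in values:
        sentence += f", with a {round(values['rain_chance'] * 100)}% chance of rain"
    return sentence + "."


def generate(record):
    return realize(plan_content(record))
```

Keeping planning and realization separate means the wording can change without touching the logic that decides what is worth saying.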
Machine Learning
- Supervised
  - Training Data
  - Both Inputs & Outputs
  - Classification
  - Naive Bayes algorithm
- Unsupervised
  - Only Inputs
  - Clustering
  - K-Means
- Reinforcement
  - Reward/Penalty
  - Q-Learning
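Of the three paradigms, the reward/penalty idea is probably the hardest to picture from bullets alone, so here is a minimal Q-learning sketch. The environment is a hypothetical 5-state corridor invented for the example, and as a simplification the agent sweeps every state-action pair in a fixed order instead of exploring through sampled episodes; the update rule itself is the standard Q-learning one.

```python
ACTIONS = {"left": -1, "right": +1}
GOAL = 4      # rightmost state of a 5-state corridor (states 0..4)
ALPHA = 0.5   # learning rate
GAMMA = 0.9   # discount factor


def step(state, action):
    """Return (next_state, reward): +1 for reaching the goal, -0.1 penalty otherwise."""
    next_state = min(max(state + ACTIONS[action], 0), GOAL)
    reward = 1.0 if next_state == GOAL else -0.1
    return next_state, reward


def train(sweeps=200):
    # Q[state][action] starts at zero; the goal is terminal and is never updated.
    q = {s: {a: 0.0 for a in ACTIONS} for s in range(GOAL + 1)}
    for _ in range(sweeps):
        for state in range(GOAL):  # non-terminal states only
            for action in ACTIONS:
                next_state, reward = step(state, action)
                best_next = max(q[next_state].values())
                # Q-learning update: move Q toward reward + discounted best future value.
                target = reward + GAMMA * best_next
                q[state][action] += ALPHA * (target - q[state][action])
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return [max(q[s], key=q[s].get) for s in range(GOAL)]
```

Calling `greedy_policy(train())` yields the action chosen in states 0 to 3. Because every step that does not reach the goal is penalized, moving right toward the goal ends up with the higher value in each state.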
Genetic Algorithm
- Abstraction of real biological evolution
- Solve complex problems => NP-hard
- Focus on optimization
- Population of possible solutions for a given problem
- From a group of individuals, the fittest survive
- Phenotype > Encode > Genotype > Decode > Phenotype
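The encode/decode cycle and survival of the fittest can be sketched as a small genetic algorithm. Everything specific here is an assumption made for the example: phenotypes are integers 0..31, genotypes are 5-bit lists, the objective `fitness` peaks at 21, and the operators (tournament selection, single-point crossover, bit-flip mutation, elitism) are one common choice among many.

```python
import random

BITS = 5  # genotype length; phenotypes are integers 0..31


def encode(x):
    """Phenotype (int) -> genotype (list of bits, most significant first)."""
    return [(x >> i) & 1 for i in reversed(range(BITS))]


def decode(genes):
    """Genotype (list of bits) -> phenotype (int)."""
    return int("".join(map(str, genes)), 2)


def fitness(x):
    """Hypothetical objective: highest (zero) at x = 21."""
    return -(x - 21) ** 2


def score(genes):
    return fitness(decode(genes))


def evolve(generations=30, pop_size=12, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [encode(rng.randrange(2 ** BITS)) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        next_gen = [population[0][:]]  # elitism: the best individual survives unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: the fitter of two random individuals becomes a parent.
            p1 = max(rng.sample(population, 2), key=score)
            p2 = max(rng.sample(population, 2), key=score)
            cut = rng.randrange(1, BITS)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation keeps some diversity in the population.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    return decode(best), history
```

`evolve()` returns the best phenotype found and the per-generation best fitness. With elitism the best fitness never decreases from one generation to the next, which is the "fittest survive" bullet in code form.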
Learning Algorithms
- Syntactic
- Semantic
Outlier Detection
- IQR (Interquartile Range) method
  - Outliers are identified using the lower and upper limits based on the IQR and filtered using boolean indexing
- Percentile method
  - Outliers are identified using a high percentile (e.g., 99th percentile) and filtered using boolean indexing
- Z-score method
  - Z-scores are calculated using SciPy's stats.zscore(). Values within a specified range (e.g., ±3 standard deviations) are retained in a new DataFrame.
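The three methods can be compared on one small DataFrame. The data and the column name `value` are hypothetical, and the 1.5 × IQR fences are the conventional choice rather than something stated in these notes. To avoid a SciPy dependency, the z-score is computed by hand with `ddof=0`, which matches the default of `stats.zscore()`.

```python
import pandas as pd

# Hypothetical sample data with one obviously extreme value.
df = pd.DataFrame({"value": [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 100]})

# IQR method: keep rows inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1 = df["value"].quantile(0.25)
q3 = df["value"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_filtered = df[(df["value"] >= lower) & (df["value"] <= upper)]

# Percentile method: keep rows at or below the 99th percentile.
p99 = df["value"].quantile(0.99)
pct_filtered = df[df["value"] <= p99]

# Z-score method: keep rows within 3 standard deviations of the mean.
# ddof=0 (population standard deviation) mirrors the stats.zscore() default.
z = (df["value"] - df["value"].mean()) / df["value"].std(ddof=0)
z_filtered = df[z.abs() <= 3]
```

Each filter is a boolean mask applied to `df`, so the original DataFrame is left untouched. Note that in a small sample a single extreme value inflates the standard deviation, so the z-score method can miss outliers that the IQR method catches.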