Data Science - Machine Learning

Data Preprocessing


  • Basic
    • Preparing features before training, typically to improve model accuracy
  • Normalization (Min-Max Normalization)
    • Scaling technique that rescales each feature to the range [0, 1]
    • X' = (X - min) / (max - min)
    • Code
          from sklearn import preprocessing
          # data: 2D array-like of shape (n_samples, n_features)
          scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))
          rescaled = scaler.fit_transform(data)  # learn per-feature min/max, then rescale
      
  • Standardization (Z Score Normalization)
    • Scaling technique that rescales each feature to have mean μ = 0 and standard deviation σ = 1 (it shifts and scales the data but does not change the shape of its distribution)
    • Z = (X - μ) / σ
    • Code
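          from sklearn import preprocessing
          # data: 2D array-like of shape (n_samples, n_features)
          scaler = preprocessing.StandardScaler().fit(data)  # learn per-feature mean and std
          rescaled = scaler.transform(data)  # apply (X - μ) / σ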
      
  • Binarization
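    • Thresholding technique that maps each value to 0 or 1: values above the threshold become 1, all others become 0
    • Useful when the dataset contains probabilities that need to be turned into crisp 0/1 values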
    • Code
          from sklearn import preprocessing
          scaler = preprocessing.Binarizer(threshold=0.5)
          rescaled = scaler.transform(data)  # values > 0.5 become 1, the rest become 0
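
The three techniques above can be checked side by side on a small array. The following is a minimal sketch, not a definitive recipe: the toy values in `data` and `probs` are made up for illustration, and the `manual_*` variables are hypothetical names used only to compare scikit-learn's output with the formulas written by hand.

```python
import numpy as np
from sklearn import preprocessing

# Hypothetical toy data: 3 samples (rows), 2 features (columns)
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 60.0]])

# Min-max normalization: X' = (X - min) / (max - min), computed per feature
minmax = preprocessing.MinMaxScaler(feature_range=(0, 1)).fit_transform(data)
manual_minmax = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0))

# Standardization: Z = (X - mean) / std, computed per feature
# (np.std defaults to the population standard deviation, as StandardScaler does)
standard = preprocessing.StandardScaler().fit_transform(data)
manual_standard = (data - data.mean(axis=0)) / data.std(axis=0)

# Binarization: values strictly greater than the threshold become 1, the rest 0
probs = np.array([[0.2, 0.5, 0.9]])
binary = preprocessing.Binarizer(threshold=0.5).fit_transform(probs)

print(np.allclose(minmax, manual_minmax))
print(np.allclose(standard, manual_standard))
print(binary)
```

Note that a value exactly equal to the threshold (0.5 here) is mapped to 0, because the comparison is strict.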
      

Data Science Life Cycle (DSLC)


  • Business Requirement (BR)
    • Deciding the objectives/goals
  • Data Acquisition (DA)
    • Collecting data from sources that suit the objective
  • Data Processing (DP)
    • Making raw data suitable for model training
  • Data Exploration (DE)
    • Understanding the data through summary statistics and visualization
  • Modelling
    • Choosing a model and training it on the processed data
  • Deployment
    • Deploying the model to an environment where it can be used
    • Technology
      • Backend => Django and Flask
      • Containers and orchestration => Docker and Kubernetes
      • Interactive web apps => Streamlit
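
The middle stages of the life cycle (acquisition through modelling) can be sketched in a few lines. This is only an illustration under stated assumptions: the Iris dataset bundled with scikit-learn stands in for a real data source, and logistic regression, the 25% test split, and `random_state=0` are arbitrary choices that the notes above do not prescribe.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Data Acquisition: a small bundled dataset (assumed stand-in for a real source)
X, y = load_iris(return_X_y=True)

# Data Exploration: a quick look at the shape and the class labels
print(X.shape, sorted(set(y)))

# Data Processing: hold out a test set, then scale using training statistics only
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Modelling: train a classifier and measure accuracy on the held-out data
model = LogisticRegression(max_iter=200).fit(X_train_s, y_train)
accuracy = model.score(X_test_s, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Fitting the scaler on the training split alone keeps information from the test set out of the preprocessing step.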

Jupyter Notebook


  • Installation
    • pip install jupyter => Installs Jupyter (Notebook plus related tools)
    • pip install notebook => Alternative: installs only the Notebook package
  • Launch Notebook
    • jupyter notebook => Starts a local server and opens the notebook dashboard in the browser
    • Click New > Python 3 => Opens a new notebook
  • Shortcuts
    • enter => New line (while editing a cell)
    • shift + enter => Run the current cell and move to the next one
    • tab => Suggestion/Autocomplete
  • Commands
    • !mkdir name => Creates a directory (the ! prefix runs a shell command)
    • %lsmagic => Lists the available magic commands
    • %%HTML => Renders the cell contents as HTML
    • %matplotlib inline => Displays matplotlib plots inline, below the cell