uni/year4/semester1/CT4101: Machine Learning/materials/topic3/examples/k-NN_hyperparameters.ipynb
2024-09-29 06:45:29 +01:00


{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this worked example, we seek to apply the k-NN algorithm to a dataset called beers.\n",
"\n",
"The dataset has already been split into two different sets: one for training (beer_training.csv) and one for testing (beer_test.csv)\n",
"\n",
"The dependent variable that we are trying to predict is style, which can be one of 3 classes: ale, lager or stout.\n",
"\n",
"In this example we will see the importance of maintaining separate training and test data, as well as how to tune the hyperparameters of a machine learning model."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" calorific_value nitrogen turbidity style alcohol sugars bitterness \\\n",
"0 45.305310 0.459548 1.917273 ale 4.227692 16.67 12.568947 \n",
"1 43.889381 0.548977 3.186364 ale 4.289231 16.73 14.974000 \n",
"2 41.588496 0.542847 1.568182 ale 4.344615 16.48 11.848789 \n",
"3 44.553097 0.480301 1.871818 ale 4.424615 18.59 13.879632 \n",
"4 41.013274 0.441860 2.345455 ale 4.264615 16.35 12.186053 \n",
"\n",
" beer_id colour degree_of_fermentation \n",
"0 167 11.04 62.178571 \n",
"1 128 13.44 63.032857 \n",
"2 88 14.04 63.468571 \n",
"3 147 12.48 63.531429 \n",
"4 74 12.12 63.747143 \n",
"(124, 10)\n"
]
}
],
"source": [
"import pandas as pd\n",
"\n",
"# details for iris dataset - this is a very simple dataset that is easy to get good results on\n",
"# training_file = \"iris_training.csv\"\n",
"# test_file = \"iris_test.csv\"\n",
"# independent_cols = [\"sepal_length\",\"sepal_width\",\"petal_length\",\"petal_width\"]\n",
"# dependent_col = \"class\"\n",
"\n",
"# details for beer dataset\n",
"training_file = \"beer_training.csv\"\n",
"test_file = \"beer_test.csv\"\n",
"independent_cols = [\"calorific_value\", \"nitrogen\", \"turbidity\", \"alcohol\", \"sugars\", \"bitterness\", \"beer_id\", \n",
" \"colour\", \"degree_of_fermentation\"]\n",
"dependent_col = \"style\"\n",
"\n",
"# Here we load our training dataset in from the training file using the pandas library\n",
"df_training = pd.read_csv(training_file)\n",
"print(df_training.head())\n",
"print(df_training.shape)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" calorific_value nitrogen turbidity alcohol sugars bitterness \\\n",
"0 45.305310 0.459548 1.917273 4.227692 16.67 12.568947 \n",
"1 43.889381 0.548977 3.186364 4.289231 16.73 14.974000 \n",
"2 41.588496 0.542847 1.568182 4.344615 16.48 11.848789 \n",
"3 44.553097 0.480301 1.871818 4.424615 18.59 13.879632 \n",
"4 41.013274 0.441860 2.345455 4.264615 16.35 12.186053 \n",
"\n",
" beer_id colour degree_of_fermentation \n",
"0 167 11.04 62.178571 \n",
"1 128 13.44 63.032857 \n",
"2 88 14.04 63.468571 \n",
"3 147 12.48 63.531429 \n",
"4 74 12.12 63.747143 \n",
"(124, 9)\n"
]
}
],
"source": [
"# set up a matrix X containing the independent variables from the training data\n",
"X_training = df_training.loc[:,independent_cols]\n",
"print(X_training.head())\n",
"print(X_training.shape)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 ale\n",
"1 ale\n",
"2 ale\n",
"3 ale\n",
"4 ale\n",
"Name: style, dtype: object\n",
"(124,)\n"
]
}
],
"source": [
"# Set up a vector y containing the dependent variable / target attribute for the training data\n",
"y_training = df_training.loc[:,dependent_col]\n",
"print(y_training.head())\n",
"print(y_training.shape)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" calorific_value nitrogen turbidity style alcohol sugars bitterness \\\n",
"0 41.721239 0.503276 2.628182 ale 4.015385 16.73 10.452789 \n",
"1 42.429204 0.525512 1.776364 ale 4.092308 16.72 10.999526 \n",
"2 45.880531 0.443233 2.628182 ale 4.276923 16.68 13.456368 \n",
"3 45.305310 0.471668 1.806364 ale 4.126154 18.84 9.202737 \n",
"4 38.977876 0.392846 2.272727 ale 4.015385 16.77 9.457895 \n",
"\n",
" beer_id colour degree_of_fermentation \n",
"0 93 13.44 55.337143 \n",
"1 103 12.24 58.380000 \n",
"2 178 10.92 58.382857 \n",
"3 166 10.92 58.525714 \n",
"4 44 10.56 58.900000 \n",
"(30, 10)\n"
]
}
],
"source": [
"# Next we load our test dataset in from the file iris_test.csv\n",
"df_test = pd.read_csv(test_file)\n",
"print(df_test.head())\n",
"print(df_test.shape)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" calorific_value nitrogen turbidity alcohol sugars bitterness \\\n",
"0 41.721239 0.503276 2.628182 4.015385 16.73 10.452789 \n",
"1 42.429204 0.525512 1.776364 4.092308 16.72 10.999526 \n",
"2 45.880531 0.443233 2.628182 4.276923 16.68 13.456368 \n",
"3 45.305310 0.471668 1.806364 4.126154 18.84 9.202737 \n",
"4 38.977876 0.392846 2.272727 4.015385 16.77 9.457895 \n",
"\n",
" beer_id colour degree_of_fermentation \n",
"0 93 13.44 55.337143 \n",
"1 103 12.24 58.380000 \n",
"2 178 10.92 58.382857 \n",
"3 166 10.92 58.525714 \n",
"4 44 10.56 58.900000 \n",
"(30, 9)\n"
]
}
],
"source": [
"# set up a matrix X containing the independent variables from the test data\n",
"X_test = df_test.loc[:,independent_cols]\n",
"print(X_test.head())\n",
"print(X_test.shape)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 ale\n",
"1 ale\n",
"2 ale\n",
"3 ale\n",
"4 ale\n",
"Name: style, dtype: object\n",
"(30,)\n"
]
}
],
"source": [
"# Set up a vector y containing the dependent variable / target attribute for the training data\n",
"y_test = df_test.loc[:,dependent_col]\n",
"print(y_test.head())\n",
"print(y_test.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To explote the effect of hyperparameters on a simple machine learning model, let's experiment with the built-in k-NN implementation in scikit-learn.\n",
"\n",
"https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html\n",
"\n",
"First we'll create a model using the default settings"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Accuracy on training data: 0.7580645161290323\n",
"Accuracy on test data: 0.4\n"
]
}
],
"source": [
"from sklearn import neighbors, metrics\n",
"\n",
"# create a model using the default settings for k-NN, n_neighbors=5, weights=uniform, p=2 (Euclidean distance)\n",
"model = neighbors.KNeighborsClassifier()\n",
"model.fit(X_training, y_training)\n",
"\n",
"# compute the predictions for the training and test sets\n",
"predictions_training = model.predict(X_training)\n",
"predictions_test = model.predict(X_test)\n",
"\n",
"# compute the accuracy on the training and test set predictions\n",
"accuracy_training = metrics.accuracy_score(y_training, predictions_training)\n",
"accuracy_test = metrics.accuracy_score(y_test, predictions_test)\n",
"print(\"Accuracy on training data:\",accuracy_training)\n",
"print(\"Accuracy on test data:\",accuracy_test)"
]
},
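{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of what the classifier above does when it predicts, the next cell gives a minimal brute-force sketch of k-NN. It is not scikit-learn's implementation: the helper `knn_predict` and the toy data are assumptions made purely for this example. Each query point is assigned the majority class among its k nearest training points, with nearness measured by the Minkowski distance of order p and every neighbour given an equal (uniform) vote."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from collections import Counter\n",
"\n",
"# hypothetical helper for illustration only, not part of scikit-learn\n",
"def knn_predict(X_train, y_train, x_query, k=5, p=2):\n",
"    # Minkowski distance of order p from the query to every training point\n",
"    distances = np.sum(np.abs(X_train - x_query) ** p, axis=1) ** (1.0 / p)\n",
"    # indices of the k closest training points\n",
"    nearest = np.argsort(distances)[:k]\n",
"    # uniform weighting: each neighbour casts one vote\n",
"    votes = Counter(y_train[i] for i in nearest)\n",
"    return votes.most_common(1)[0][0]\n",
"\n",
"# toy data (made up for this sketch): two well-separated clusters\n",
"X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])\n",
"y_toy = np.array([\"ale\", \"ale\", \"ale\", \"stout\", \"stout\", \"stout\"])\n",
"print(knn_predict(X_toy, y_toy, np.array([0.3, 0.3]), k=3))\n",
"print(knn_predict(X_toy, y_toy, np.array([4.8, 5.0]), k=3))"
]
},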
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29]\n",
"[1.0, 0.8306451612903226, 0.7580645161290323, 0.6854838709677419, 0.6370967741935484, 0.6048387096774194, 0.5806451612903226, 0.5967741935483871, 0.5645161290322581, 0.5241935483870968, 0.5161290322580645, 0.46774193548387094, 0.46774193548387094, 0.46774193548387094, 0.43548387096774194]\n",
"[0.5666666666666667, 0.4666666666666667, 0.4, 0.4, 0.4, 0.43333333333333335, 0.4, 0.3333333333333333, 0.43333333333333335, 0.4, 0.4, 0.4, 0.36666666666666664, 0.3, 0.3333333333333333]\n"
]
}
],
"source": [
"# Now let's evaluate the effect of using different k values\n",
"# start at k=1 and test all odd k values up to 21\n",
"k_values = list(range(1,31,2))\n",
"print(k_values)\n",
"\n",
"accuracy_training_k = []\n",
"accuracy_test_k = []\n",
"for k in k_values:\n",
" model_k = neighbors.KNeighborsClassifier(k)\n",
" model_k.fit(X_training, y_training)\n",
"\n",
" # compute the predictions for the training and test sets\n",
" predictions_training_k = model_k.predict(X_training)\n",
" predictions_test_k = model_k.predict(X_test)\n",
"\n",
" # compute the accuracy on the training and test set predictions\n",
" accuracy_training_k.append(metrics.accuracy_score(y_training, predictions_training_k))\n",
" accuracy_test_k.append(metrics.accuracy_score(y_test, predictions_test_k))\n",
"\n",
"print(accuracy_training_k)\n",
"print(accuracy_test_k)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# let's plot the accuracy on the training and test set\n",
"import matplotlib.pyplot as plt\n",
"plt.scatter(k_values,accuracy_training_k,marker=\"x\")\n",
"plt.scatter(k_values,accuracy_test_k,marker=\"+\")\n",
"plt.xlim([0, max(k_values)+2])\n",
"plt.ylim([0.0, 1.1])\n",
"plt.xlabel(\"Value of k\")\n",
"plt.ylabel(\"Accuracy\")\n",
"legend_labels = [\"Training (Euclidian dist.)\",\"Test (Euclidian dist.)\"]\n",
"plt.legend(labels=legend_labels, loc=4, borderpad=1)\n",
"plt.title(\"Effect of k on training and test set accuracy\", fontsize=10)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"p = 1 training [1.0, 0.8306451612903226, 0.8064516129032258, 0.75, 0.75, 0.717741935483871, 0.7016129032258065, 0.6854838709677419, 0.6451612903225806, 0.6854838709677419, 0.6451612903225806, 0.5806451612903226, 0.5564516129032258, 0.5645161290322581, 0.5645161290322581] \n",
"\n",
"p = 1 test [0.7, 0.7, 0.5333333333333333, 0.43333333333333335, 0.4666666666666667, 0.4, 0.5, 0.4666666666666667, 0.4666666666666667, 0.5, 0.4666666666666667, 0.5, 0.43333333333333335, 0.5, 0.4666666666666667] \n",
"\n",
"p = 2 training [1.0, 0.8306451612903226, 0.7580645161290323, 0.6854838709677419, 0.6370967741935484, 0.6048387096774194, 0.5806451612903226, 0.5967741935483871, 0.5645161290322581, 0.5241935483870968, 0.5161290322580645, 0.46774193548387094, 0.46774193548387094, 0.46774193548387094, 0.43548387096774194] \n",
"\n",
"p = 2 test [0.5666666666666667, 0.4666666666666667, 0.4, 0.4, 0.4, 0.43333333333333335, 0.4, 0.3333333333333333, 0.43333333333333335, 0.4, 0.4, 0.4, 0.36666666666666664, 0.3, 0.3333333333333333] \n",
"\n",
"p = 3 training [1.0, 0.8064516129032258, 0.7096774193548387, 0.6774193548387096, 0.6290322580645161, 0.6129032258064516, 0.5483870967741935, 0.5403225806451613, 0.5161290322580645, 0.49193548387096775, 0.4596774193548387, 0.43548387096774194, 0.4435483870967742, 0.4435483870967742, 0.41935483870967744] \n",
"\n",
"p = 3 test [0.5333333333333333, 0.36666666666666664, 0.4, 0.3333333333333333, 0.36666666666666664, 0.4, 0.3333333333333333, 0.3333333333333333, 0.4, 0.43333333333333335, 0.4, 0.36666666666666664, 0.3333333333333333, 0.3333333333333333, 0.3333333333333333] \n",
"\n",
"p = 4 training [1.0, 0.7983870967741935, 0.6774193548387096, 0.6532258064516129, 0.6370967741935484, 0.5887096774193549, 0.5403225806451613, 0.5080645161290323, 0.5080645161290323, 0.47580645161290325, 0.45161290322580644, 0.41935483870967744, 0.4435483870967742, 0.41935483870967744, 0.4274193548387097] \n",
"\n",
"p = 4 test [0.5333333333333333, 0.3333333333333333, 0.43333333333333335, 0.3, 0.3, 0.4, 0.3333333333333333, 0.36666666666666664, 0.4, 0.4, 0.4, 0.3333333333333333, 0.3, 0.3333333333333333, 0.3] \n",
"\n"
]
}
],
"source": [
"# Now let's explore the impact of using a different distance metric by changing the value of p used in the Minkowski formula\n",
"p_values = list(range(1,5))\n",
"# print(p_values)\n",
"\n",
"accuracy_training_k_p = []\n",
"accuracy_test_k_p = []\n",
"for j in range(len(p_values)):\n",
" accuracy_training_k_p.append([])\n",
" accuracy_test_k_p.append([]) \n",
"\n",
" for k in k_values:\n",
" model_k_p = neighbors.KNeighborsClassifier(n_neighbors=k, p=p_values[j])\n",
" model_k_p.fit(X_training, y_training)\n",
"\n",
" # compute the predictions for the training and test sets\n",
" predictions_training_k_p = model_k_p.predict(X_training)\n",
" predictions_test_k_p = model_k_p.predict(X_test)\n",
"\n",
" # compute the accuracy on the training and test set predictions\n",
" accuracy_training_k_p[j].append(metrics.accuracy_score(y_training, predictions_training_k_p))\n",
" accuracy_test_k_p[j].append(metrics.accuracy_score(y_test, predictions_test_k_p))\n",
"\n",
" print(\"p =\",p_values[j],\"training\",accuracy_training_k_p[j],\"\\n\")\n",
" print(\"p =\",p_values[j],\"test\",accuracy_test_k_p[j],\"\\n\")"
]
},
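{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loop above varies p, the order of the Minkowski distance. As a minimal sketch of what that parameter changes, the next cell computes the distance between two made-up points for p = 1 to 4. The function `minkowski_distance` is a hypothetical helper written for this illustration, not a scikit-learn function. With p=1 it reduces to the Manhattan distance and with p=2 to the Euclidean distance."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# hypothetical helper for illustration, not a scikit-learn function\n",
"def minkowski_distance(a, b, p):\n",
"    # sum the p-th powers of the absolute coordinate differences, then take the p-th root\n",
"    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)\n",
"\n",
"# two made-up points whose coordinates differ by 3 and 4\n",
"point_a = np.array([0.0, 0.0])\n",
"point_b = np.array([3.0, 4.0])\n",
"for order in range(1, 5):\n",
"    # order=1 gives the Manhattan distance (3 + 4), order=2 the Euclidean distance\n",
"    print(\"p =\", order, \"distance =\", minkowski_distance(point_a, point_b, order))"
]
},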
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEUCAYAAAAmxTHXAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/Il7ecAAAACXBIWXMAAAsTAAALEwEAmpwYAAAxeUlEQVR4nO3dfXgU5fXw8e8hRAgQQIgoihqsKIS3BKKiUQxFWyooVFFBfEOr4g/BYgURFfnRxxatltYHrQ+0Yn1XVASRim+kYJRCEERAVJRYYxGBSghvEuA8f8zsugmbZJPsZDOZ87muvbI7O3vPmZnsnp257z0jqooxxpjgapToAIwxxiSWJQJjjAk4SwTGGBNwlgiMMSbgLBEYY0zAWSIwxpiAs0TQgInIQRFZHXGb6E4/R0TWudNSROQP7uM/1GAZk2rwmktF5BMRWVxueq6ILKhue9VctufLqGLZZ9XwtekickWc4vi1iDSLR1umYWic6ACMp/aqamaU6SOA36vq0wAiciPQRlUP1mAZk4DfVfM11wM3qOp7NVien+UCu4D3a/DadOAK4Nk4xPFr4GlgTxzaqhERaayqBxK1fFOWHREEjIj8CrgM+K2IPCMi84EWwEoRuVxEjhKRl0VkhXvLcV/XQkRmi8jHIrJGRC4RkWlAintk8UyUZQ13518rIve70yYDZwN/q+wIREROE5FVIvKTctPTRWSpiHzo3s5yp+eKSJ6IvCQiG9x1E/e5Ae60D4GLK1jetSIyz23jcxG5t4L5Dlsnd/ouEblPRD4SkWUicnT5uIFRwDh3e51TybY+N+IobpWIpALTgHPcaePKtd1eRJa4z60VkXPc6T8TkQ/c7TTH3YdjgWOBxeWPyEL7x41lrYjMjNiGJ4vI2+76fRjaLyJyh7s9PnL/H3C3YbZ7P01ECiO28XwReRd4x43nHbe9j0VkcEQcV7v/Zx+JyFMikioim0Qk2X2+ZeRjU0uqarcGegMOAqsjbpe7058AhkbMtyvi/rPA2e79E4BP3Pv3A3+KmO/I8q8tt+xjgX8DR+Eceb4LDHGfywOyo7wmF1gAnAWsBE6IMk8zoKl7vxNQEPHaYqADzhecD3ASTlPga3deAV4EFkRp91pgM9AWSAHWlo+xinVS4EL3/gPA3VGWMQW4PYZt/RqQ495v4S4rN1rc7jy/Ae5y7ycBqUAasARo7k6/A5js3i8E0ipoq03E/aci1ulfwC/d+03d/fALnKObZpGvjdy/bhyFEdu4KGK+xkDLiPk2uvuoK/BZKMaI+WdHbO8bgYcS/R5rKDc7NdSwVXRqqDLnARnuF0GAliLSwp0+LDRRVb+vop3TgDxV3QrgHjH0BV6t4nVdgJnAz1T1P1GeTwZmiEgmTqI7JeK55apa5C5vNc7plF3AJlX93J3+NM6HSDRvqep2d75XcBJJQYzrtB8niYGTxM6vYj2h4m2dD/zRbf8VVS2KmCeaFcDj7rfjV1V1tYicC2QA+e5rj8BJjlXpJyITcD7o2wDrRCQPOE5V5wKo6j4AETkPmK2qe9zp/42h/bci5hPgdyLSFzgEHAccDfwUmKOq28q1+1dgAs72HgncEMPyTAwsEZjyGgF9Qm/2kCo+iOJpM843ziwgWiIYB2wBeuLEGhnnDxH3D1L9/+/yhbeqU4irVN2vqtVYdtRtDUwTkdeBC3A+yH9eWSOqusT9MB0IPCEifwS+x/nQHR7rCohIU+BRnG/zX4vIFJx9UV0H+PG0c/nX7464PwLn6Kq3qpa6p5AqXJ6q5runBnOBJFVdW4PYTBTWR2DKexMYE3rgfvMGeAsYHTH9SPduaQXnaZcD57rniJOA4cA/Y1j+DpwPtN+7b/jyWgGbVfUQcBXOqZDKbADS5ce+hso+GM8XkTYikgIMwflmHqmm6xRSgnPaJiTqthaRn6jqx6p6P863/c5RXkvE604EtqjqLJxvzb2AZUCOiJzsztNcREJHTxW1Ff
oQ3uYemQwFUNUSoEhEhrhtNRFn1NFbwEj3PiLSxn19IdDbvT+0ku3RCvjOTQL9gBPd6e8Cl4pI23LtAjyJc0ptdiXtmmqyRNCwhTpyQ7dpMbxmLJDtdtStx+ngBPg/wJFuJ+JHQD93+kxgjZTrLFbVzcBEYDHwEbBSVefFErSqbgEGAY+IyBnlnn4UuMaNoTNlv2FGa2sfzqmg18XpLP6uktmXAy8Da4CXVTXytFCt1sn1GvBLd1+cQ8Xb+tfudl4DlAL/cGM66HaejivXbi7wkYisAi4H/uyevroWeM5t5wOc7QXOPnujfGexqu4AZuH0jyzCSUIhVwFj3bbeB45R1TeA+UCBeyrudnfeB4Gb3XjSKtkez7jr/zFwNU7SRlXXAfcB/3T38x/LveZI4LlK2jXVJD8ezRoTXCJyLc4pkVsSHYupmIgMBQar6lWJjqUhsT4CY4wviMj/xRmpdEGiY2lo7IjAGGMCzvoIjDEm4CwRGGNMwPmujyAtLU3T09MTHYYxxvjKypUrt6nqUdGe810iSE9Pp6CgoOoZjTHGhInIVxU9Z6eGjDEm4CwRGGNMwFkiMMaYgPNdH4ExQVBaWkpRURH79pWvR2dM5Zo2bUqHDh1ITo79Ug2WCIyph4qKikhNTSU9Pb0uK78an1NVtm/fTlFRER07doz5dXZqyJh6aN++fbRt29aSgKkWEaFt27bVPpK0RGBMPZWIJLB9+3YyMzPJzMzkmGOO4bjjjgs/3r9/f5Wvz8vL4/33K74k86uvvsrUqVMBmDJlCiLCxo0bw8//6U9/QkRqPER8ypQpPPjgg9V6ze9+9+Mlt3fs2MGjjz5ao2V7acaMGZx88smICNu2bQtPX7BgAZMnTy4zb03+bywRGONj5WuF1bZ2WNu2bVm9ejWrV69m1KhRjBs3Lvz4iCOOqPL1VSWCBx54gP/5n/8JP+7evTvPP/98+PGcOXPo2rVrrdahuvyQCHJycnj77bc58cQTy0wfOHAgr732Gnv27KlV+5YIjPGp6W99xtQF68Mf/qrK1AXrmf7WZ3FdzsqVKzn33HPp3bs3P//5z9m8eTMADz/8MBkZGfTo0YNhw4ZRWFjIY489xvTp08nMzGTp0qVl2vnss89o0qQJaWk/XqJgyJAhzJvnXNLhiy++oFWrVmWev/nmm8nOzqZr167ce++94enp6ence++99OrVi+7du7Nhw4bwc+vXryc3N5eTTjqJhx9+uMyyevfuTdeuXZk5cyYAEydOZO/evWRmZjJixAgmTpzIF198QWZmJuPHj2fXrl30798/vJxQrIWFhXTp0oUbbriBrl278rOf/Yy9e/dWuh2nTJnCVVddxZlnnkmnTp2YNWtWzPsgKyuLaBUVRITc3FwWLFhw+IuqwTqLjfEhVWXnvlJm5xcCMHlQBlMXrGd2fiEjc9JR1bicWlJVxowZw7x58zjqqKN44YUXuOuuu3j88ceZNm0amzZtokmTJuzYsYPWrVszatQoWrRowe23335YW/n5+fTq1avMtJYtW3L88cezdu1a5s2bx+WXX87s2T9efOy+++6jTZs2HDx4kP79+7NmzRp69OgBQFpaGh9++CGPPvooDz74IH/9618B2LBhA4sXL6akpIRTTz2Vm2++meTkZB5//HHatGnD3r17Oe2007jkkkuYNm0aM2bMYPXq1YDzAb927drw4wMHDjB37lxatmzJtm3b6NOnDxdddBEAn3/+Oc899xyzZs3isssu4+WXX+bKK6+sdHuuWbOGZcuWsXv3brKyshg4cCCpqamcc845Ued/9tlnycjIqLTN7Oxsli5dymWXXVbpfJWxRGCMD4kIkwc5HxCz8wvDCWFkTjqTB2XErX/hhx9+YO3atZx//vkAHDx4kPbt2wPQo0cPRowYwZAhQxgyZEiVbW3evJmjjjq81M2wYcN4/vnnWbRoEe+8806ZRPDiiy8yc+ZMDhw4wObNm1m/fn04EVx88cUA9O7dm1
deeSX8moEDB9KkSROaNGlCu3bt2LJlCx06dODhhx9m7ty5AHz99dd8/vnntG3bttKYVZVJkyaxZMkSGjVqxDfffMOWLVsA6NixI5mZmeEYCgsLq9wGgwcPJiUlhZSUFPr168fy5csZMmRIOPHURLt27fjPf6Jd3jt2lgiM8alQMgglASCuSQCcD8KuXbvywQcfHPbc66+/zpIlS3jttde47777+PjjjyttKyUlheLi4sOmDxo0iPHjx5OdnU3Lli3D0zdt2sSDDz7IihUrOPLII7n22mvLjIZp0qQJAElJSRw4cOCw6ZHP5eXl8fbbb/PBBx/QrFkzcnNzYxpZ88wzz7B161ZWrlxJcnIy6enp4deVX05Vp4bg8I5cEaGkpKRWRwT79u0jJSWlymVXxvoIjPGpUJ9ApMg+g3ho0qQJW7duDSeC0tJS1q1bx6FDh/j666/p168f999/P8XFxezatYvU1FRKSkqittWlS5cyI4RCmjVrxv33389dd91VZvrOnTtp3rw5rVq1YsuWLfzjH/+o8XoUFxdz5JFH0qxZMzZs2MCyZcvCzyUnJ1NaWgpwWPzFxcW0a9eO5ORkFi9ezFdfVVi3LWzGjBnMmDEj6nPz5s1j3759bN++nby8PE477TRSU1PDHfLlb1UlAXD6Xrp161blfJWxRGCMD4WSQKhPYNPvL2BkTjqz8wvjmgwaNWrESy+9xB133EHPnj3JzMzk/fff5+DBg1x55ZV0796drKwsxo4dS+vWrbnwwguZO3du1M7ivn37smrVqqixDRs27LD+g549e5KVlUXnzp254ooryMnJqfF6DBgwgAMHDtClSxcmTpxInz59ws/deOON4dNcbdu2JScnh27dujF+/HhGjBhBQUEB3bt358knn6Rz585VLmvDhg0VnnLq0aMH/fr1o0+fPtxzzz0ce+yxMcX/8MMP06FDB4qKiujRowe/+tWvws8tXryYgQMHxtRORXx3qcrs7Gy1MtSmofvkk0/o0qVLpfNMf+szdu4rDZ8OCiWHlk2TGXf+KXUUafXceuutXHjhhZx33nmJDsUzgwYN4pVXXjlsuO2UKVMq7EivqS1btnDFFVfwzjvvlJke7f9HRFaqana0djzrIxCRx4FBwHeqethxizgny/6McyHqPcC1qvqhV/EY09CMO/+UMqODQn0G9fnXyJMmTeJf//pXosPwVG2HclbHv//9bx566KFat+NlZ/ETwAzgyQqe/wXQyb2dAfzF/WuMiVG0zsf67Oijjw4PvwyaKVOmxL3N0047LS7teNZHoKpLgP9WMstg4El1LANai0h7r+IxxhgTXSI7i48Dvo54XOROO4yI3CgiBSJSsHXr1joJzhhjgsIXo4ZUdaaqZqtqdrQfpBhjjKm5RCaCb4DjIx53cKcZY4ypQ4lMBPOBq8XRByhW1c0JjMeYwKvrMtSR7WdmZrJjx45qx/zEE09wyy23APDYY4/x5JOHj08pLCwM/+iqoKCAsWPHVns5Xgn9MC8jI4OuXbvy5z//Ofzc7bffzrvvvut5DF4OH30OyAXSRKQIuBdIBlDVx4CFOENHN+IMHx3pVSzGmNiEylBDzca95+Xl0aJFC84666yozz/wwAPMnz8//HjcuHFxHVc/atSoKufJzs4mOzvqcPqEaNy4MQ899BC9evWipKSE3r17c/7555ORkcGYMWO44YYb+OlPf+ppDF6OGhququ1VNVlVO6jq31T1MTcJ4I4WGq2qP1HV7qpqvxIzpiZmD3RuHvGyDHU0kd/wwfmBVl5eHgBvvPEGvXr1omfPnvTv3/+w10ZemGblypX07NmTnj178sgjj4TnycvLY9CgQQAsX76cM888k6ysLM466yw+/fTTcAwXX3wxAwYMoFOnTkyYMKHK7ZSbm8utt95KZmYm3bp1Y/ny5VW+BqB9+/bhX1WnpqbSpUsXvvnGOUt+4oknsn37dr799tuY2qopKzpnjKmQ12Wop0+fztNPPw
3AkUceyeLFiyuMZevWrdxwww0sWbKEjh078t//VjY6HUaOHMmMGTPo27cv48ePjzpP586dWbp0KY0bN+btt99m0qRJvPzyywCsXr2aVatW0aRJE0499VTGjBnD8ccfH7WdkD179rB69WqWLFnCddddx9q1a1m8eDHjxo07bN5mzZoddhqtsLCQVatWccYZP/6kqlevXuTn53PJJZdUuuzasERgjF+FjgK+eq/s45Gvx20RXpehrs6poWXLltG3b9/wRdnbtGlT4bw7duxgx44d9O3bF4CrrroqatG64uJirrnmGj7//HNEJFx8DqB///60atUKgIyMDL766qsqE8Hw4cMBp67Szp072bFjB/369YupzPSuXbu45JJL+NOf/lSmCms8ykxXxRKBMaZCdVGGurzGjRtz6NCh8OPqXoi9Ou655x769evH3LlzKSwsJDc3N/xctHLWVYn2S+9YjghKS0u55JJLGDFiRPg6CyHxKDNdFV/8jsAYE8XI153biWc7t9DjOKqLMtTlpaens3r16vAyQufa+/Tpw5IlS9i0aRNApaeGWrduTevWrXnvPedo6Zlnnok6X3FxMccd5/yO9YknnqgyNoCrr766wvP/L7zwAgDvvfcerVq1olWrVuEjgvK3UBJQVa6//nq6dOnCbbfddlib8SgzXRVLBMaYCnldhjrUsRy6FRYWkpOTQ8eOHcnIyGDs2LHhfoWjjjqKmTNncvHFF9OzZ08uv/zySmOfPXs2o0ePJjMzs8Ky3BMmTODOO+8kKysrpm/84FxusqLy0U2bNiUrK4tRo0bxt7/9Lab28vPzeeqpp3j33XfD22HhwoWAk3g3btzo/SgnVfXVrXfv3mpMQ7d+/fpEh+CJsWPH6ltvvZXoMGqsuLhYhw4dGvW5c889V1esWBHX5b3yyit69913V/t10f5/gAKt4HPVjgiMMXVm0qRJ7NmzJ9Fh1FjLli2ZM2dOnS3vwIED/OY3v/F8OdZZbIypMw25DHXotw7xdOmll8a9zWjsiMAYYwLOEoExxgScJQJjjAk4SwTGGBNwlgiMMWG1KUMda3nniiqT1sSqVau4/vrra93O9ddfT8+ePenRowdDhw5l165dAMyYMYPHH3+81u3Xd6IV/NCivsrOztaCAitUahq2Tz75hC5duiQ0hmhlqA8cOEDjxvVnsOGll17K3XffTc+ePWvVzs6dO8P1fW677TbatWvHxIkT2bNnDzk5OaxatSoe4daZaP8/IrJSVaP+Ms2OCIzxuZL9JQx+dTAl+6OXdqita6+9llGjRnHGGWcwYcKECks3R5Z3njJlCtdddx25ubmcdNJJPPzww+H2WrRoEZ4/NzeXoUOH0rlzZ0aMGBH+BfDChQvp3LkzvXv3ZuzYseF2y6x3SQlr1qwJJ4EpU6Zw1VVXceaZZ9KpUydmzZoV8zqGkoCqsnfv3nDNoGbNmpGenh5zSWm/qj+p3RhTI0uKlvBl8ZcsLVrKBSdd4MkyioqKeP/990lKSmLnzp0Vlm6OtGHDBhYvXkxJSQmnnnoqN998M8nJyWXmWbVqFevWrePYY48lJyeH/Px8srOzuemmm8LlpkMVPcsrKCg4rAbPmjVrWLZsGbt37yYrK4uBAweSmprKOeecE7WNZ599loyMDMApW71w4UIyMjJ46KGHwvNkZ2ezdOlSTj/99GptMz+xRGCMT0345wTyivLYf9A5dz/pvUlM+WAKuR1yeeDcB+K6rEsvvZSkpCSg8tLNkQYOHEiTJk1o0qQJ7dq1Y8uWLXTo0KHMPKeffnp4WqjWUIsWLTjppJPC5aaHDx/OzJkzD2s/WlnrwYMHk5KSQkpKCv369WP58uUMGTIkpjLQs2fP5uDBg4wZM4YXXniBkSOdiya2a9eODRs2VPl6P7NTQ8b41C1Zt9C+eXuSGznfspMbJdO+eXvGZI2J+7KaN28evh8q3bx27Vpee+21CstEx1LGuSalnkNSUlIOW3a0MtAlJS
VlCttF3tavX19m/qSkJIYNG1bmCKcuykAnmiUCY3zqhJYnMDpzNKWHSklpnELpoVJGZ47m+JaVXzyltmpSurk6Tj31VL788ksKCwuBH0s7lxetrPW8efPYt28f27dvJy8vj9NOO43U1NSoZaBXr15NRkYGqhpuR1WZP38+nTt3DrdZF2WgE80SgTE+tqhwESmNUxidOZqUxim8Wfim58usSenm6khJSeHRRx9lwIAB9O7dm9TU1PCVwiJ17tyZ4uLiMtc/6NGjB/369aNPnz7cc889FZaLjqSqXHPNNXTv3p3u3buzefNmJk+eHH4+Pz8/fIW2hsqGjxpTD8U6fHTttrUc0/wY0lLS2LZ3G1t2b6FrWtc6iNBbu3btokWLFqgqo0ePplOnTlGv8jV9+nRSU1P51a9+FXW4a22tWrWKP/7xjzz11FNxa7Mu2PBRYwKkW1o30lLSAEhLSWsQSQBg1qxZZGZm0rVrV4qLi7npppuiznfzzTeX6WeIt23btvHb3/7Ws/brCzsiMKYeqg8/KDP+ZUcExhhjqsUSgTHGBJwlAmOMCThLBMYYE3CWCIwxYUEtQx0yduzYcFE8CE4Zaqs1ZIwJa9u2bbguT3XLUGdnZ5OdHXVQShnvv/9+XGIF+N3vfsfdd98dl7YKCgr4/vvvy0y77rrryMnJ4brrrovLMuorOyIwxqc+7Z3NJ527HHb7tHfVH8bVEYQy1AcPHmT8+PE88EDZYn1WhtoYU68d2r27WtNro6GXoZ4xYwYXXXQR7du3P2weK0NdSyIyAPgzkAT8VVWnlXv+BODvQGt3nomqutDLmIwx1deQy1D/5z//Yc6cOeTl5UV9PghlqD1LBCKSBDwCnA8UAStEZL6qRtZ9vRt4UVX/IiIZwEIg3auYjDE1E60M9dy5cyksLCQ3Nzfqa+pTGerKjgg2bdrExo0bOfnkkwHYs2cPJ598crgiaRDKUHt5RHA6sFFVvwQQkeeBwUBkIlCgpXu/FfAfD+MxxsRBXZahTk9Pr7QMdeSVxMApQ33nnXeye/du8vLymDZtWrgMdUUyMjL49ttvw49btGhRprz1Z599Rk5OTu1Wqp7zsrP4OODriMdF7rRIU4ArRaQI52gg6hU1RORGESkQkYKtW7d6EasxJkYNrQx1VawMdW0aFhkKDFDVX7mPrwLOUNVbIua5zY3hIRE5E/gb0E1VD1XUrhWdM0EQS9G5T3tnR+0YbtS8Oaeu9Pd7xMpQ1051i855eWroGyDyUkkd3GmRrgcGAKjqByLSFEgDvvMwLmN8QVUPO+cdye8f9pWZNWsWf//739m/fz9ZWVmVlqGeM2eOZ3H4sQx1Tb7ce3lE0Bj4DOiPkwBWAFeo6rqIef4BvKCqT4hIF+Ad4DitJCg7IjBBsGnTJlJTU2nbtm2lycCYSKrK9u3bKSkpCY+6CknIEYGqHhCRW4BFOENDH1fVdSIyFShQ1fnAb4BZIjIOp+P42sqSgDFB0aFDB4qKirA+MVNdTZs2PWyYblXswjTGGBMAdmEaY4wxFbJEYIwxAWeJwBhjAs4SgTHGBJwlAmOMCThLBMYYE3CWCIwxJuAsERhjTMBZIjDGmIBrkImg/K+l4/HraS/aNMaY+qDBJYLpb33G1AXrwx/UqsrUBeuZ/tZn9apNY4ypLxpUIlBVdu4rZXZ+YfiDe+qC9czOL2TnvtIafYv3ok1jjKlPPL14fV0TESYPygBgdn4hs/MLARiZk87kQRk1KufrRZvGGFOfNKgjAij7wR1S2w9sL9o0xpj6osElgtCpm0iR5/frS5vGGFNfNKhEEHn+fmROOpt+fwEjc9LLnN+vD20aY0x90uD6CFo2TS5z/j50Sqdl0+Qa9xHEu01jjKlPGuQVyspf9Luqi4DHwos2jTGmrgTuCmXlP6Dj8YHtRZvGGFMfNMhEYIwxJnaWCIwxJuAsERhjTMBZIjDGmICrMhGIyIUiYgnDGGMaqFg+4C8HPheRB0Sks9
cBGWOMqVtVJgJVvRLIAr4AnhCRD0TkRhFJ9Tw6Y4wxnovplI+q7gReAp4H2gO/BD4UkTEexmaMMaYOxNJHcJGIzAXygGTgdFX9BdAT+I234RljjPFaLLWGLgGmq+qSyImqukdErvcmLGOMMXUllkQwBdgceiAiKcDRqlqoqu94FZgxxpi6EUsfwRzgUMTjg+60KonIABH5VEQ2isjECua5TETWi8g6EXk2lnaNMcbETyxHBI1VdX/ogaruF5EjqnqRiCQBjwDnA0XAChGZr6rrI+bpBNwJ5Kjq9yLSrtprYIwxplZiOSLYKiIXhR6IyGBgWwyvOx3YqKpfuonkeWBwuXluAB5R1e8BVPW72MI2xhgTL7EcEYwCnhGRGYAAXwNXx/C649x5Q4qAM8rNcwqAiOQDScAUVX2jfEMiciNwI8AJJ5wQw6KNMcbEqspEoKpfAH1EpIX7eFecl98JyAU6AEtEpLuq7igXw0xgJjgXponj8o0xJvBiulSliAwEugJNQxdkUdWpVbzsG+D4iMcd3GmRioB/qWopsElEPsNJDCtiicsYY0ztxfKDssdw6g2NwTk1dClwYgxtrwA6iUhHt3N5GDC/3Dyv4hwNICJpOKeKvowxdmOMMXEQS2fxWap6NfC9qv4vcCbuuf3KqOoB4BZgEfAJ8KKqrhORqRGdz4uA7SKyHlgMjFfV7TVZkcPMHujcjDHGVCqWU0P73L97RORYYDtOvaEqqepCYGG5aZMj7itwm3szxhiTALEkgtdEpDXwB+BDQIFZXgZVK6GjgK/eK/t45OuJiccYY+q5ShOBe0Gad9xRPC+LyAKgqaoW10VwxhhjvFdpIlDVQyLyCM71CFDVH4Af6iKwGgt987cjAWOMiUksncXviMglEho3auLK6Sap+LExxngtlj6Cm3A6cw+IyD6cIaSqqi09jayGPu2dzaHdu3+ccH8XABo1b86pKwsSFFV009/6jJ37Spk8KAMRQVWZumA9LZsmM+78KgdmGWNMXMRyqcpUVW2kqkeoakv3cb1MAkDZJBDD9ERRVXbuK2V2fiFTF6wPJ4HZ+YXs3FdqRwbGmDpT5RGBiPSNNr38hWpM9YgIkwdlADA7v5DZ+YUAjMxJDx8hGGNMXYjl1ND4iPtNcaqKrgR+6klEARJKBqEkAFgSMMbUuVhODV0YcTsf6AZ8731oDV/odFCk0GkiY4ypK7GMGiqvCOgS70CCJrJPYGROOpt+fwEjc9LL9BkYY0xdiKWP4P/i/JoYnMSRifML43qpUfPmUTuGGzVvnoBoKiYitGyaXKZPINRn0LJpsp0eqiVVLbMNyz8OAj9tAz/FCv6LtypS1TdPEbkm4uEBoFBV8z2NqhLZ2dlaUFC/hoHWRkP7h6oPbFiuv7aBn2IF/8UbIiIrVTU72nOxnBp6CXhaVf+uqs8Ay0SkWVwjDLDyH/qWBGrHhuX6axv4KVbwX7yxiuWIYBlwXujKZO6Vyt5U1bPqIL7DNLQjAhN/kW/OkKANy/XTNvBTrOC/eENqe0TQNPLylO59OyIw9VZkf0tIfX+TxpuftoGfYgX/xRuLWBLBbhHpFXogIr2Bvd6FZEzt2LBcf20DP8UK/os3FrEkgl8Dc0RkqYi8B7yAc+UxY+odG5brr23gp1jBf/HGqsrho6q6QkQ6A6e6kz51LzZvAsYPI5y8Hpbr1TaIZ7t+2gZ+G0btZbyJfH/F0lk8GnjGvTgNInIkMFxVH/U+vMMltLM4wNc48NuQOS/eVF5tA6/a9dM28MOXjEjxjrcu3l+17Sy+IZQEAFT1e+CGuERmfMGPQ+biPSzXq23g5bb1yzbwIlavxTPe+vD+iuWI4GOgh7ozikgSsEZVu3oeXRQJOSIofx3kE892/gboyMCvQ+biyatt4Kdt66dY/aQutmttjwjeAF4Qkf4i0h94DvhHXCIzvtEQh8xVl1fbwE/b1k+x+kmit2ssieAO4F1glHv7GEjxMqh6Z+Trzu3Es5
1b6HGANMQhc9Xl1Tbw07b1U6x+kujtGksZ6kPAv4BCnGsR/BT4xNuwTG2V/weqzT9UQx0yVx1ebQM/bVs/xRoSz/eBV+rDdq1w+KiInAIMd2/bcH4/gKr28zyq+irORwGHXV/ZVdvrK8d7BEKdDPGr5yOyvNoGZdrdNh55Qph87YJatxtSsr+EKxdeydMXPE3qEam1astvQz39MtKtPmzXyn5HsAFYCgxS1Y1uwOM8jyhAvLi+cuQIBHDOM0Z+26jpMLdx559S5rWhf9b69ub3klfbINzuE/HftkuKlvBl8ZcsLVrKBSddUOv2/PJ/4NX7wCuJ3q4VjhoSkSHAMCAHp8P4eeCvqtqxTiKrQEMqOvdJ54qv79NlQ83PvvlqZIeNyPJkG0z45wTyivLYf3A/B/UgSZLEEUlHkNshlwfOfaCWAfuDr94HdaBGo4ZU9VVVHQZ0BhbjlJpoJyJ/EZGfeRKpiYtEj0AwiXdL1i20b96e5EbJACQ3SqZ98/aMyRqT4Mjqjr0PYhdLZ/FuVX1WVS8EOgCrcEYSmXoq0SMQqsVGZHmyDU5oeQKjM0dTeqiUlMYplB4qZXTmaI5veXycgq7/fPU+SLBqXbNYVb9X1Zmq2t+rgEzt1IcRCKZ+WFS4iJTGKYzOHE1K4xTeLHwz0SHVGXsfVE+VReeMd7y4vnJ9GIFQXaqKRHwDrm8deXUmzkdCI7uN5M4z7iQtJY2BJw1ky+4tcW2/PvPj+yCRqiwxUavGRQYAfwaScDqap1Uw3yU4l8Q8TVUr7QluSJ3FISWzf8GV8i1PX7Gk1kP8QpxRKIOcByNfj9+Ha5yHeX7csxeNfzj88hYHmqTQ/aMPa9d4PR+SGuLFMGKvhiZ71S7gyf7yWzE7L9W2xERNF5oEPAL8AsgAhotIRpT5UoFbcX60FkhL2MuXcoClRUvj1qYfinipatQkAND4h72BOXz3YhixF2162a5X/PA+qA+8PDV0OrBRVb8EEJHngcHA+nLz/Ra4HxjvYSz10oQnziCPveyXQ4AwackdTFkykdyOA2o3xK/8cMR4fNPyoM2q3pQ1ftN6sf7GO7a/Es6zIwLgOODriMdF7rQwcS6BebyqVrrHReRGESkQkYKtW7fGP9IEuUVb0Z4kkt1vvslAe5ICNcTPGJN4CessFpFGwB+Ba6uaV1VnAjPB6SPwNrK6c8LINxld+CYT8n5Digr7GzVidN/7az/EL/RNKp7frLxo0yt+itXY/qoHvDwi+AaI/ETr4E4LSQW6AXkiUgj0AeaLSNTOjIZqUeEiUhBGa6tADvGrzfPGmPjw8ohgBdBJRDriJIBhwBWhJ1W1GEgLPRaRPOD2qkYNJVI8C3iFlBnit3dbfIf4efDNqmTE88422F8SlyJmB5qkVDhqqNYdez4pEujFMGIv2vSyXYjv/1YZdqRRJc8SgaoeEJFbgEU4w0cfV9V1IjIVKFDV+V4t2yvxLuAF0C2tW/h+WkoaaSlplcydePHeBt0/+tA3Q/y8GjFT62GXddSml+2CN+8vExtPf0fghUT8jsAKeNk2AO+KBAadZ/9bVtCwjIT8jqAhsQJetg2Md+x/K/EsEcTACnjZNjDe8ex/ywoaxswSQYyCXMArxLaB8Yr9byWW9RHEaO22tRzT/BjSUtLY5o7u6ZrWtc7jSKSgbwNP6+wEXND/t+pCZX0Elghi4NcPAC+Gu/qB1/vLiyKBQeb5/orz+8DLeL3837LO4lryW6GtkMjheEHi9f7yokhgkHm+v+L8PvAy3kT9b9kRQQz8Nmww6EM9vdpf4SKBHOKgCEkKRyC1LxIYcJ7tL4/eB17EWxf/W3ZEEDA2HM8bViTQX/z0Pkj0/5YlggbIhnp644SRbzI6935KpREpKpQ2SmJ0bhyKBBpP+Ol9kOj/LUsEDZQNx/NGkIsE+pGf3geJ/N+yPoIY+G3UkJ8ufegFL2O1YY7x57fLavr1/VVZH4FdvD4G9e2Drip+uvShF7zcX3
4rEugHXu0vz4sExrGqaaLfX5YIjDGmOhrgpTWtj8AYYwLOjgiMMaY6GuClNe2IwBhjAs6OCBogP1360BiveP4/G8cjgUS/v2z4qDHGBICVmDDGGFMhSwTGGBNwlgiMMSbgLBEYY0zAWSIwxpiAs0RgjDEBZ4nAGGMCzhKBMcYEnCUCY4wJOEsExhgTcJYIjDEm4CwRGGNMwHmaCERkgIh8KiIbRWRilOdvE5H1IrJGRN4RkRO9jMcYY8zhPEsEIpIEPAL8AsgAhotIRrnZVgHZqtoDeAl4wKt4jDHGROflEcHpwEZV/VJV9wPPA4MjZ1DVxaq6x324DOjgYTzGGGOi8DIRHAd8HfG4yJ1WkeuBf0R7QkRuFJECESnYunVrHEM0xhhTLzqLReRKIBv4Q7TnVXWmqmaravZRRx1Vt8EZY0wD5+WlKr8Bjo943MGdVoaInAfcBZyrqj94GI8xxpgovDwiWAF0EpGOInIEMAyYHzmDiGQB/w+4SFW/8zAWY4wxFfAsEajqAeAWYBHwCfCiqq4TkakicpE72x+AFsAcEVktIvMraM4YY4xHvDw1hKouBBaWmzY54v55Xi7fGGNM1epFZ7ExxpjEsURgjDEBZ4nAGGMCzhKBMcYEnCUCY4wJOEsExhgTcJYIjDEm4CwRGGNMwFkiMMaYgLNEYIwxAWeJwBhjAs4SgTHGBJwlAmOMCThLBMYYE3CWCIwxJuAsERhjTMBZIjDGmICzRGCMMQFnicAYYwLOEoExxgScJQJjjAk4SwTGGBNwlgiMMSbgLBEYY0zAWSIwxpiAs0RgjDEBZ4nAGGMCzhKBMcYEnCUCY4wJOEsExhgTcJYIjDEm4DxNBCIyQEQ+FZGNIjIxyvNNROQF9/l/iUi6l/EYY4w5nGeJQESSgEeAXwAZwHARySg32/XA96p6MjAduN+reIwxxkTn5RHB6cBGVf1SVfcDzwODy80zGPi7e/8loL+IiIcxGWOMKaexh20fB3wd8bgIOKOieVT1gIgUA22BbZEziciNwI3uwx9EZK0nESdWGuXWu4Gw9fIXWy9/qc56nVjRE14mgrhR1ZnATAARKVDV7ASHFHe2Xv5i6+Uvtl6V8/LU0DfA8RGPO7jTos4jIo2BVsB2D2MyxhhTjpeJYAXQSUQ6isgRwDBgfrl55gPXuPeHAu+qqnoYkzHGmHI8OzXknvO/BVgEJAGPq+o6EZkKFKjqfOBvwFMishH4L06yqMpMr2JOMFsvf7H18hdbr0qIfQE3xphgs18WG2NMwFkiMMaYgPNVIqiqZIVfiUihiHwsIqtFpCDR8dSUiDwuIt9F/s5DRNqIyFsi8rn798hExlgTFazXFBH5xt1nq0XkgkTGWBMicryILBaR9SKyTkRudaf7dp9Vsk4NYX81FZHlIvKRu27/607v6Jbo2eiW7Dmi2m37pY/ALVnxGXA+zo/TVgDDVXV9QgOLAxEpBLJV1dc/eBGRvsAu4ElV7eZOewD4r6pOc5P3kap6RyLjrK4K1msKsEtVH0xkbLUhIu2B9qr6oYikAiuBIcC1+HSfVbJOl+H//SVAc1XdJSLJwHvArcBtwCuq+ryIPAZ8pKp/qU7bfjoiiKVkhUkgVV2CM/orUmQZkb/jvCl9pYL18j1V3ayqH7r3S4BPcH7t79t9Vsk6+Z46drkPk92bAj/FKdEDNdxffkoE0UpWNIgdjLMz3xSRlW45jYbkaFXd7N7/Fjg6kcHE2S0issY9deSb0yfRuJV/s4B/0UD2Wbl1ggawv0QkSURWA98BbwFfADtU9YA7S40+F/2UCBqys1W1F06l1tHuqYgGx/2xoD/ORVbtL8BPgExgM/BQQqOpBRFpAbwM/FpVd0Y+59d9FmWdGsT+UtWDqpqJU6nhdKBzPNr1UyKIpWSFL6nqN+7f74C5ODu4odjinrcNnb/9LsHxxIWqbnHflIeAWfh0n7nnml8GnlHVV9zJvt5n0dapoeyvEFXdASwGzgRauy
V6oIafi35KBLGUrPAdEWnudmohIs2BnwENqbpqZBmRa4B5CYwlbkIflK5f4sN95nY+/g34RFX/GPGUb/dZRevUQPbXUSLS2r2fgjNw5hOchDDUna1G+8s3o4YA3CFff+LHkhX3JTai2hORk3COAsAp+fGsX9dLRJ4DcnFK424B7gVeBV4ETgC+Ai5TVV91vFawXrk4pxkUKARuijiv7gsicjawFPgYOOROnoRzTt2X+6ySdRqO//dXD5zO4CScL/EvqupU9zPkeaANsAq4UlV/qFbbfkoExhhj4s9Pp4aMMcZ4wBKBMcYEnCUCY4wJOEsExhgTcJYIjDEm4CwRmMBwq1L+vNy0X4tIhQW6RCRPRDy96LmIPOeWPhhXbvoTIjK0otcZEy+eXarSmHroOZwfIi6KmDYMmJCYcEBEjgFOU9WTExWDMXZEYILkJWBgqF67W5TsWGCpiPxFRAoi67yXJyK7Iu4PFZEn3PtHicjLIrLCveVEeW1TEZktznUnVolIP/epN4Hj3Br551QUuIj81j1CSKrhuhtTITsiMIGhqv8VkeU4xf3m4RwNvKiqKiJ3uc8nAe+ISA9VXRNj038GpqvqeyJyAs4RR5dy84x2QtDuItIZp9rsKcBFwAK3kFhUIvIHIBUYqfYLUOMBOyIwQRM6PYT79zn3/mUi8iHOT/S7AhnVaPM8YIZbHng+0NKtfhnpbOBpAFXdgFO64ZQY2r4HaKWqoywJGK/YEYEJmnnAdBHpBTRT1ZUi0hG4Hedc/ffuKZ+mUV4b+UEc+XwjoI+q7vMg3hVAbxFp45d6P8Z/7IjABIp7hafFwOP8eDTQEtgNFIvI0TinjqLZIiJdRKQRTgXLkDeBMaEHIpIZ5bVLgRHu86fgFHT7NIaQ3wCmAa+HqtQaE2+WCEwQPQf0dP+iqh/hnBLaADwL5FfwuonAAuB9nIubhIwFst0hoOuBUVFe+yjQSEQ+Bl4Aro21QqSqzsGpoT/fLT9sTFxZ9VFjjAk4OyIwxpiAs0RgjDEBZ4nAGGMCzhKBMcYEnCUCY4wJOEsExhgTcJYIjDEm4P4/J7yZq61p3k8AAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# let's plot the accuracy on the training set\n",
"import matplotlib.pyplot as plt\n",
"plt.scatter(k_values,accuracy_training_k_p[0],marker=\"x\")\n",
"plt.scatter(k_values,accuracy_training_k_p[1],marker=\"x\")\n",
"plt.scatter(k_values,accuracy_training_k_p[2],marker=\"*\")\n",
"plt.scatter(k_values,accuracy_training_k_p[3],marker=\"s\")\n",
"plt.xlim([0, max(k_values)+2])\n",
"plt.ylim([0.0, 1.1])\n",
"plt.xlabel(\"Value of k\")\n",
"plt.ylabel(\"Accuracy\")\n",
"legend_labels = [\"Training (Manhattan, p=1)\",\"Training (Euclidian, p=2)\",\"Training (p=3)\",\"Training (p=4)\"]\n",
"plt.legend(labels=legend_labels, loc=1, borderpad=0.2)\n",
"plt.title(\"Effect of k and p on training set accuracy\", fontsize=10)\n",
"plt.show()\n",
"\n",
"# let's plot the accuracy on the test set\n",
"import matplotlib.pyplot as plt\n",
"plt.scatter(k_values,accuracy_test_k_p[0],marker=\"x\")\n",
"plt.scatter(k_values,accuracy_test_k_p[1],marker=\"+\")\n",
"plt.scatter(k_values,accuracy_test_k_p[2],marker=\"*\")\n",
"plt.scatter(k_values,accuracy_test_k_p[3],marker=\"s\")\n",
"plt.xlim([0, max(k_values)+2])\n",
"plt.ylim([0.0, 1.1])\n",
"plt.xlabel(\"Value of k\")\n",
"plt.ylabel(\"Accuracy\")\n",
"legend_labels = [\"Test (Manhattan, p=1)\",\"Test (Euclidian, p=2)\",\"Training (p=3)\",\"Training (p=4)\"]\n",
"plt.legend(labels=legend_labels, loc=1, borderpad=0.2)\n",
"plt.title(\"Effect of k and p on test set accuracy\", fontsize=10)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"w = uniform ,p = 1 training [1.0, 0.8306451612903226, 0.8064516129032258, 0.75, 0.75, 0.717741935483871, 0.7016129032258065, 0.6854838709677419, 0.6451612903225806, 0.6854838709677419, 0.6451612903225806, 0.5806451612903226, 0.5564516129032258, 0.5645161290322581, 0.5645161290322581] \n",
"\n",
"w = uniform ,p = 1 test [0.7, 0.7, 0.5333333333333333, 0.43333333333333335, 0.4666666666666667, 0.4, 0.5, 0.4666666666666667, 0.4666666666666667, 0.5, 0.4666666666666667, 0.5, 0.43333333333333335, 0.5, 0.4666666666666667] \n",
"\n",
"w = uniform ,p = 2 training [1.0, 0.8306451612903226, 0.7580645161290323, 0.6854838709677419, 0.6370967741935484, 0.6048387096774194, 0.5806451612903226, 0.5967741935483871, 0.5645161290322581, 0.5241935483870968, 0.5161290322580645, 0.46774193548387094, 0.46774193548387094, 0.46774193548387094, 0.43548387096774194] \n",
"\n",
"w = uniform ,p = 2 test [0.5666666666666667, 0.4666666666666667, 0.4, 0.4, 0.4, 0.43333333333333335, 0.4, 0.3333333333333333, 0.43333333333333335, 0.4, 0.4, 0.4, 0.36666666666666664, 0.3, 0.3333333333333333] \n",
"\n",
"w = uniform ,p = 3 training [1.0, 0.8064516129032258, 0.7096774193548387, 0.6774193548387096, 0.6290322580645161, 0.6129032258064516, 0.5483870967741935, 0.5403225806451613, 0.5161290322580645, 0.49193548387096775, 0.4596774193548387, 0.43548387096774194, 0.4435483870967742, 0.4435483870967742, 0.41935483870967744] \n",
"\n",
"w = uniform ,p = 3 test [0.5333333333333333, 0.36666666666666664, 0.4, 0.3333333333333333, 0.36666666666666664, 0.4, 0.3333333333333333, 0.3333333333333333, 0.4, 0.43333333333333335, 0.4, 0.36666666666666664, 0.3333333333333333, 0.3333333333333333, 0.3333333333333333] \n",
"\n",
"w = uniform ,p = 4 training [1.0, 0.7983870967741935, 0.6774193548387096, 0.6532258064516129, 0.6370967741935484, 0.5887096774193549, 0.5403225806451613, 0.5080645161290323, 0.5080645161290323, 0.47580645161290325, 0.45161290322580644, 0.41935483870967744, 0.4435483870967742, 0.41935483870967744, 0.4274193548387097] \n",
"\n",
"w = uniform ,p = 4 test [0.5333333333333333, 0.3333333333333333, 0.43333333333333335, 0.3, 0.3, 0.4, 0.3333333333333333, 0.36666666666666664, 0.4, 0.4, 0.4, 0.3333333333333333, 0.3, 0.3333333333333333, 0.3] \n",
"\n",
"w = distance ,p = 1 training [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] \n",
"\n",
"w = distance ,p = 1 test [0.7, 0.7333333333333333, 0.6333333333333333, 0.5, 0.5333333333333333, 0.43333333333333335, 0.4666666666666667, 0.4666666666666667, 0.4666666666666667, 0.5333333333333333, 0.4666666666666667, 0.43333333333333335, 0.4666666666666667, 0.5, 0.4666666666666667] \n",
"\n",
"w = distance ,p = 2 training [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] \n",
"\n",
"w = distance ,p = 2 test [0.5666666666666667, 0.5333333333333333, 0.5, 0.4666666666666667, 0.36666666666666664, 0.4, 0.4, 0.43333333333333335, 0.4, 0.43333333333333335, 0.43333333333333335, 0.4, 0.43333333333333335, 0.4, 0.36666666666666664] \n",
"\n",
"w = distance ,p = 3 training [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] \n",
"\n",
"w = distance ,p = 3 test [0.5333333333333333, 0.4666666666666667, 0.4666666666666667, 0.3333333333333333, 0.36666666666666664, 0.3333333333333333, 0.36666666666666664, 0.4, 0.4, 0.4, 0.36666666666666664, 0.4, 0.4, 0.4, 0.4] \n",
"\n",
"w = distance ,p = 4 training [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0] \n",
"\n",
"w = distance ,p = 4 test [0.5333333333333333, 0.43333333333333335, 0.43333333333333335, 0.3, 0.3333333333333333, 0.36666666666666664, 0.36666666666666664, 0.4, 0.4, 0.36666666666666664, 0.36666666666666664, 0.36666666666666664, 0.36666666666666664, 0.4, 0.3333333333333333] \n",
"\n"
]
}
],
"source": [
"# Now let's explore the impact of using a different weighting scheme\n",
"w_values = [\"uniform\",\"distance\"]\n",
"accuracy_training_k_p_w = []\n",
"accuracy_test_k_p_w = []\n",
"\n",
"for i in range(len(w_values)):\n",
" accuracy_training_k_p_w.append([])\n",
" accuracy_test_k_p_w.append([])\n",
" \n",
" for j in range(len(p_values)):\n",
" accuracy_training_k_p_w[i].append([])\n",
" accuracy_test_k_p_w[i].append([]) \n",
"\n",
" for k in k_values:\n",
" model_k_p_w = neighbors.KNeighborsClassifier(n_neighbors=k, p=p_values[j], weights=w_values[i])\n",
" model_k_p_w.fit(X_training, y_training)\n",
"\n",
" # compute the predictions for the training and test sets\n",
" predictions_training_k_p_w = model_k_p_w.predict(X_training)\n",
" predictions_test_k_p_w = model_k_p_w.predict(X_test)\n",
"\n",
" # compute the accuracy on the training and test set predictions\n",
" accuracy_training_k_p_w[i][j].append(metrics.accuracy_score(y_training, predictions_training_k_p_w))\n",
" accuracy_test_k_p_w[i][j].append(metrics.accuracy_score(y_test, predictions_test_k_p_w))\n",
"\n",
" print(\"w =\",w_values[i],\",p =\",p_values[j],\"training\",accuracy_training_k_p_w[i][j],\"\\n\")\n",
" print(\"w =\",w_values[i],\",p =\",p_values[j],\"test\",accuracy_test_k_p_w[i][j],\"\\n\")"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"# let's plot the accuracy on the training set\n",
"import matplotlib.pyplot as plt\n",
"plt.scatter(k_values,accuracy_training_k_p_w[0][0],marker=\"x\")\n",
"plt.scatter(k_values,accuracy_training_k_p_w[0][1],marker=\"+\")\n",
"plt.scatter(k_values,accuracy_training_k_p_w[1][0],marker=\"*\")\n",
"plt.scatter(k_values,accuracy_training_k_p_w[1][1],marker=\"s\")\n",
"plt.xlim([0, max(k_values)+2])\n",
"plt.ylim([0.0, 1.1])\n",
"plt.xlabel(\"Value of k\")\n",
"plt.ylabel(\"Accuracy\")\n",
"legend_labels = [\"Training (Manhattan dist.,w=uniform)\",\"Training (Euclidian dist.,w=uniform)\",\n",
" \"Training (Manhattan dist.,w=distance)\",\"Training (Euclidian dist.,w=distance)\"]\n",
"plt.legend(labels=legend_labels, loc=4, borderpad=0.2)\n",
"plt.title(\"Effect of k and p and w on training set accuracy\", fontsize=10)\n",
"plt.show()\n",
"\n",
"# let's plot the accuracy on the test set\n",
"import matplotlib.pyplot as plt\n",
"plt.scatter(k_values,accuracy_test_k_p_w[0][0],marker=\"x\")\n",
"plt.scatter(k_values,accuracy_test_k_p_w[0][1],marker=\"+\")\n",
"plt.scatter(k_values,accuracy_test_k_p_w[1][0],marker=\"*\")\n",
"plt.scatter(k_values,accuracy_test_k_p_w[1][1],marker=\"s\")\n",
"plt.xlim([0, max(k_values)+2])\n",
"plt.ylim([0.0, 1.1])\n",
"plt.xlabel(\"Value of k\")\n",
"plt.ylabel(\"Accuracy\")\n",
"legend_labels = [\"Test (Manhattan dist.,w=uniform)\",\"Test (Euclidian dist.,w=uniform)\",\n",
" \"Training (Manhattan dist.,w=distance)\",\"Training (Euclidian dist.,w=distance)\"]\n",
"plt.legend(labels=legend_labels, loc=1, borderpad=0.2)\n",
"plt.title(\"Effect of k and p and w on test set accuracy\", fontsize=10)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Max test set accuracy: 0.7333333333333333\n",
"Index of max test set accuracy: (1, 0, 1)\n",
"Hyperparameter values: w = distance p = 1 k = 3\n"
]
}
],
"source": [
"# let's find the best test set accuracy, using numpy\n",
"import numpy as np\n",
"np_array = np.array(accuracy_test_k_p_w)\n",
"max_index = np.unravel_index(np_array.argmax(), np_array.shape)\n",
"print(\"Max test set accuracy:\",np_array.max())\n",
"print(\"Index of max test set accuracy:\",max_index)\n",
"print(\"Hyperparameter values: w =\",w_values[max_index[0]],\"p =\",p_values[max_index[1]],\"k =\",k_values[max_index[2]]) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Key take away points from this worked example:\n",
"* Distance weighting seems to work better than uniform weighting on the beer dataset\n",
"* Manhattan distance seems to work best of the 4 distance metrics tested on the beer dataset\n",
"* Lower values of k seem to work better thna higher values on the beer dataset\n",
"* It is important to look at accuracy on both the training and test sets when deciding on model parameters\n",
"* Training set accuracy is usually much higher than test set accuracy\n",
"* The best test set accuracy we found on the beer dataset was with distance weighting, Manhattan distance and k=3. This combination also achieves 100% training set accuracy\n",
"* We have manually explored hyperparameter values in this example, however scikit-learn provides a class called GridSearchCV which can automate the hyperparameter search process (we will cover this in a later lecture) "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}