{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Working with Watson Machine Learning" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook should be run in a Watson Studio project, using the **Default Spark Python** runtime environment. **If you are viewing this in Watson Studio and do not see Python 3.7 in the upper right corner of your screen, please update the runtime now.** It requires service credentials for the following Cloud services:\n", " * Watson OpenScale\n", " * Watson Machine Learning V2 plans\n", " * Cloud Object Storage\n", " \n", "If you have a paid Cloud account, you may also provision a **Databases for PostgreSQL** or **Db2 Warehouse** service to take full advantage of integration with Watson Studio and continuous learning services. If you choose not to provision this paid service, you can use the free internal PostgreSQL storage with OpenScale, but will not be able to configure continuous learning for your model.\n", "\n", "The notebook will train, create and deploy a German Credit Risk model, configure OpenScale to monitor that deployment, and inject seven days' worth of historical records and measurements for viewing in the OpenScale Insights dashboard." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Contents\n", "\n", "- [Setup](#setup)\n", "- [Model building and deployment](#model)\n", "- [OpenScale configuration](#openscale)\n", "- [Quality monitor and feedback logging](#quality)\n", "- [Fairness, drift monitoring and explanations](#fairness)\n", "- [Custom monitors and metrics](#custom)\n", "- [Historical data](#historical)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Note: These samples use the latest OpenScale V2 client from public PyPI. 
It does not cover the following aspects for now:\n", "\n", "- Historical payload logging \n", "- Historical manual labeling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Setup " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Spark check" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install pyspark==2.4.0 --no-cache | tail -n 1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "try:\n", "    from pyspark.sql import SparkSession\n", "except ImportError:\n", "    print('Error: Spark runtime is missing. If you are using Watson Studio change the notebook runtime to Spark.')\n", "    raise" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Package installation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "!rm -rf /home/spark/shared/user-libs/python3.7*\n", "\n", "!pip install --upgrade pandas==0.25.3 --no-cache | tail -n 1\n", "!pip install --upgrade requests==2.23 --no-cache | tail -n 1\n", "!pip install --upgrade numpy==1.20.3 --user --no-cache | tail -n 1\n", "!pip install SciPy --no-cache | tail -n 1\n", "!pip install lime --no-cache | tail -n 1\n", "\n", "!pip install --upgrade ibm-watson-machine-learning --user | tail -n 1\n", "!pip install --upgrade ibm-watson-openscale --no-cache | tail -n 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Provision services and configure credentials" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you have not already, provision an instance of IBM Watson OpenScale using the [OpenScale link in the Cloud catalog](https://cloud.ibm.com/catalog/services/watson-openscale)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Your Cloud API key can be generated by going to the [**Users** section of the Cloud console](https://cloud.ibm.com/iam#/users). From that page, click your name, scroll down to the **API Keys** section, and click **Create an IBM Cloud API key**. Give your key a name and click **Create**, then copy the created key and paste it below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**NOTE:** You can also get the OpenScale `API_KEY` using the IBM Cloud CLI.\n", "\n", "How to install the IBM Cloud (bluemix) CLI: [instructions](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use)\n", "\n", "How to get an API key using the CLI:\n", "```\n", "bx login --sso\n", "bx iam api-key-create 'my_key'\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "CLOUD_API_KEY = \"******\"\n", "IAM_URL=\"https://iam.ng.bluemix.net/oidc/token\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Watson Machine Learning credentials below also take a Cloud API key, created as described above. Alternatively, generate an IAM token using that key and paste it below." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### WML credentials example with API key" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "WML_CREDENTIALS = {\n", "    \"url\": \"https://us-south.ml.cloud.ibm.com\",\n", "    \"apikey\": \"******\"\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### WML credentials example using IAM_token \n", "\n", "**NOTE**: If IAM_TOKEN is used for authentication and you receive an unauthorized/expired token error at any step, please create a new token and re-authenticate the clients." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Uncomment this cell if you want to use IAM_TOKEN\n", "# import requests\n", "# from requests.auth import HTTPBasicAuth\n", "# def generate_access_token():\n", "#     headers={}\n", "#     headers[\"Content-Type\"] = \"application/x-www-form-urlencoded\"\n", "#     headers[\"Accept\"] = \"application/json\"\n", "#     auth = HTTPBasicAuth(\"bx\", \"bx\")\n", "#     data = {\n", "#         \"grant_type\": \"urn:ibm:params:oauth:grant-type:apikey\",\n", "#         \"apikey\": CLOUD_API_KEY\n", "#     }\n", "#     response = requests.post(IAM_URL, data=data, headers=headers, auth=auth)\n", "#     json_data = response.json()\n", "#     iam_access_token = json_data['access_token']\n", "#     return iam_access_token" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Uncomment this cell if you want to use IAM_TOKEN\n", "# IAM_TOKEN = generate_access_token()\n", "# WML_CREDENTIALS = {\n", "#     \"url\": \"https://us-south.ml.cloud.ibm.com\",\n", "#     \"token\": IAM_TOKEN\n", "# }" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cloud Object Storage details\n", "\n", "In the next cells, you will need to paste your Cloud Object Storage credentials. 
If you haven't worked with COS yet, please visit the [getting started with COS tutorial](https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-getting-started). \n", "You can find the `COS_API_KEY_ID` and `COS_RESOURCE_CRN` values under **_Service Credentials_** in the menu of your COS instance. The COS service credentials you use must be created with the _Role_ parameter set to Writer. Later, the training data file will be uploaded to a bucket in your instance and used as the training data reference in the subscription. \n", "The `COS_ENDPOINT` value can be found in the **_Endpoint_** field of the menu." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "COS_API_KEY_ID = \"******\"\n", "COS_RESOURCE_CRN = \"******\" # eg \"crn:v1:bluemix:public:cloud-object-storage:global:a/3bf0d9003abfb5d29761c3e97696b71c:d6f04d83-6c4f-4a62-a165-696756d63903::\"\n", "COS_ENDPOINT = \"https://******\" # Current list available at https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "BUCKET_NAME = \"******\" #example: \"credit-risk-training-data\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial can use Databases for PostgreSQL, Db2 Warehouse, or a free internal version of PostgreSQL to create a datamart for OpenScale.\n", "\n", "If you have previously configured OpenScale, it will use your existing datamart, and not interfere with any models you are currently monitoring. Do not update the cell below.\n", "\n", "If you do not have a paid Cloud account or would prefer not to provision this paid service, you may use the free internal PostgreSQL service with OpenScale. Do not update the cell below.\n", "\n", "To provision a new instance of Db2 Warehouse, locate [Db2 Warehouse in the Cloud catalog](https://cloud.ibm.com/catalog/services/db2-warehouse), give your service a name, and click **Create**. 
Once your instance is created, click the **Service Credentials** link on the left side of the screen. Click the **New credential** button, give your credentials a name, and click **Add**. Your new credentials can be accessed by clicking the **View credentials** button. Copy and paste your Db2 Warehouse credentials into the cell below.\n", "\n", "To provision a new instance of Databases for PostgreSQL, locate [Databases for PostgreSQL in the Cloud catalog](https://cloud.ibm.com/catalog/services/databases-for-postgresql), give your service a name, and click **Create**. Once your instance is created, click the **Service Credentials** link on the left side of the screen. Click the **New credential** button, give your credentials a name, and click **Add**. Your new credentials can be accessed by clicking the **View credentials** button. Copy and paste your Databases for PostgreSQL credentials into the cell below." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "DB_CREDENTIALS = None\n", "#DB_CREDENTIALS= {\"hostname\":\"\",\"username\":\"\",\"password\":\"\",\"database\":\"\",\"port\":\"\",\"ssl\":True,\"sslmode\":\"\",\"certificate_base64\":\"\"}\n", "SCHEMA_NAME = None  # schema for the datamart; set this when DB_CREDENTIALS is supplied" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__If you previously configured OpenScale to use the free internal version of PostgreSQL, you can switch to a new datamart using a paid database service.__ If you would like to delete the internal PostgreSQL configuration and create a new one using service credentials supplied in the cell above, set the __KEEP_MY_INTERNAL_POSTGRES__ variable below to __False__. In this case, the notebook will remove your existing internal PostgreSQL datamart and create a new one with the supplied credentials. 
__*NO DATA MIGRATION WILL OCCUR.*__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "KEEP_MY_INTERNAL_POSTGRES = True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run the notebook\n", "\n", "At this point, the notebook is ready to run. You can either run the cells one at a time, or click the **Kernel** option above and select **Restart and Run All** to run all the cells." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Model building and deployment " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section you will learn how to train a Spark MLlib model and then deploy it as a web service using the Watson Machine Learning service." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load the training data from GitHub" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "from IPython.utils import io\n", "\n", "with io.capture_output() as captured:\n", "    !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/credit_risk/german_credit_data_biased_training.csv -O german_credit_data_biased_training.csv\n", "!ls -lh german_credit_data_biased_training.csv" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "from pyspark.sql import SparkSession\n", "import pandas as pd\n", "import json\n", "import datetime\n", "\n", "spark = SparkSession.builder.getOrCreate()\n", "pd_data = pd.read_csv(\"german_credit_data_biased_training.csv\", sep=\",\", header=0)\n", "df_data = spark.read.csv(path=\"german_credit_data_biased_training.csv\", sep=\",\", header=True, inferSchema=True)\n", "training_data_file_name = \"german_credit_data_biased_training.csv\"\n", "df_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Explore data" ] }, { "cell_type": "code", "execution_count": null, 
"metadata": { "scrolled": true }, "outputs": [], "source": [ "df_data.printSchema()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "print(\"Number of records: \" + str(df_data.count()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Save training data to Cloud Object Storage" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import ibm_boto3\n", "from ibm_botocore.client import Config, ClientError\n", "\n", "cos_client = ibm_boto3.resource(\"s3\",\n", " ibm_api_key_id=COS_API_KEY_ID,\n", " ibm_service_instance_id=COS_RESOURCE_CRN,\n", " ibm_auth_endpoint=\"https://iam.bluemix.net/oidc/token\",\n", " config=Config(signature_version=\"oauth\"),\n", " endpoint_url=COS_ENDPOINT\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open(training_data_file_name, \"rb\") as file_data:\n", " cos_client.Object(BUCKET_NAME, training_data_file_name).upload_fileobj(\n", " Fileobj=file_data\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "spark_df = df_data\n", "(train_data, test_data) = spark_df.randomSplit([0.8, 0.2], 24)\n", "\n", "MODEL_NAME = \"Spark German Risk Model - Final\"\n", "DEPLOYMENT_NAME = \"Spark German Risk Deployment - Final\"\n", "\n", "print(\"Number of records for training: \" + str(train_data.count()))\n", "print(\"Number of records for evaluation: \" + str(test_data.count()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The code below creates a Random Forest Classifier with Spark, setting up string indexers for the categorical features and the label column. Finally, this notebook creates a pipeline including the indexers and the model, and does an initial Area Under ROC evaluation of the model." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "from pyspark.ml.feature import OneHotEncoder, StringIndexer, IndexToString, VectorAssembler\n", "from pyspark.ml.evaluation import BinaryClassificationEvaluator\n", "from pyspark.ml import Pipeline, Model\n", "from pyspark.ml.feature import SQLTransformer\n", "\n", "features = [x for x in spark_df.columns if x != 'Risk']\n", "categorical_features = ['CheckingStatus', 'CreditHistory', 'LoanPurpose', 'ExistingSavings', 'EmploymentDuration', 'Sex', 'OthersOnLoan', 'OwnsProperty', 'InstallmentPlans', 'Housing', 'Job', 'Telephone', 'ForeignWorker']\n", "categorical_num_features = [x + '_IX' for x in categorical_features]\n", "si_list = [StringIndexer(inputCol=x, outputCol=y) for x, y in zip(categorical_features, categorical_num_features)]\n", "va_features = VectorAssembler(inputCols=categorical_num_features + [x for x in features if x not in categorical_features], outputCol=\"features\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "si_label = StringIndexer(inputCol=\"Risk\", outputCol=\"label\").fit(spark_df)\n", "label_converter = IndexToString(inputCol=\"prediction\", outputCol=\"predictedLabel\", labels=si_label.labels)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "from pyspark.ml.classification import RandomForestClassifier\n", "\n", "classifier = RandomForestClassifier(featuresCol=\"features\")\n", "feature_filter = SQLTransformer(statement=\"SELECT * FROM __THIS__\")\n", "pipeline = Pipeline(stages= si_list + [si_label, va_features, classifier, label_converter, feature_filter])\n", "model = pipeline.fit(train_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note:** If you want to filter features from the model output, please replace **`*`** with the feature names to be retained in the **`SQLTransformer`** 
statement." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "predictions = model.transform(test_data)\n", "evaluatorDT = BinaryClassificationEvaluator(rawPredictionCol=\"prediction\")\n", "area_under_curve = evaluatorDT.evaluate(predictions)\n", "\n", "print(\"areaUnderROC = %g\" % area_under_curve)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Publish the model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section, the notebook uses the supplied Watson Machine Learning credentials to save the model (including the pipeline) to the WML instance. Previous versions of the model are removed so that the notebook can be run again, resetting all data for another demo." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "import json\n", "from ibm_watson_machine_learning import APIClient\n", "\n", "wml_client = APIClient(WML_CREDENTIALS)\n", "wml_client.version" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Listing all the available spaces" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wml_client.spaces.list(limit=10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "WML_SPACE_ID='******' # use space id here\n", "wml_client.set.default_space(WML_SPACE_ID)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Remove existing model and deployment" ] }, 
{ "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "deployments_list = wml_client.deployments.get_details()\n", "for deployment in deployments_list[\"resources\"]:\n", "    model_id = deployment[\"entity\"][\"asset\"][\"id\"]\n", "    deployment_id = deployment[\"metadata\"][\"id\"]\n", "    if deployment[\"metadata\"][\"name\"] == DEPLOYMENT_NAME:\n", "        print(\"Deleting deployment id\", deployment_id)\n", "        wml_client.deployments.delete(deployment_id)\n", "        print(\"Deleting model id\", model_id)\n", "        wml_client.repository.delete(model_id)\n", "wml_client.repository.list_models()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "training_data_references = [\n", "    {\n", "        \"id\": \"product line\",\n", "        \"type\": \"s3\",\n", "        \"connection\": {\n", "            \"access_key_id\": COS_API_KEY_ID,\n", "            \"endpoint_url\": COS_ENDPOINT,\n", "            \"resource_instance_id\":COS_RESOURCE_CRN\n", "        },\n", "        \"location\": {\n", "            \"bucket\": BUCKET_NAME,\n", "            \"path\": training_data_file_name,\n", "        }\n", "    }\n", "]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "software_spec_uid = wml_client.software_specifications.get_id_by_name(\"spark-mllib_2.4\")\n", "print(\"Software Specification ID: {}\".format(software_spec_uid))\n", "model_props = {\n", "    wml_client._models.ConfigurationMetaNames.NAME:\"{}\".format(MODEL_NAME),\n", "    #wml_client._models.ConfigurationMetaNames.SPACE_UID: WML_SPACE_ID,\n", "    wml_client._models.ConfigurationMetaNames.TYPE: \"mllib_2.4\",\n", "    wml_client._models.ConfigurationMetaNames.SOFTWARE_SPEC_UID: software_spec_uid,\n", "    wml_client._models.ConfigurationMetaNames.TRAINING_DATA_REFERENCES: training_data_references,\n", "    wml_client._models.ConfigurationMetaNames.LABEL_FIELD: \"Risk\",\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Storing model ...\")\n", "published_model_details = wml_client.repository.store_model(\n", "    model=model, \n", "    meta_props=model_props, \n", "    training_data=train_data, \n", "    pipeline=pipeline)\n", "\n", "model_uid = wml_client.repository.get_model_uid(published_model_details)\n", "print(\"Done\")\n", "print(\"Model ID: {}\".format(model_uid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deploy the model" ] 
}, { "cell_type": "markdown", "metadata": {}, "source": [ "The next section of the notebook deploys the model as a RESTful web service in Watson Machine Learning. The deployed model will have a scoring URL you can use to send data to the model for predictions." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "deployment_details = wml_client.deployments.create(\n", " model_uid, \n", " meta_props={\n", " wml_client.deployments.ConfigurationMetaNames.NAME: \"{}\".format(DEPLOYMENT_NAME),\n", " wml_client.deployments.ConfigurationMetaNames.ONLINE: {}\n", " }\n", ")\n", "scoring_url = wml_client.deployments.get_scoring_href(deployment_details)\n", "deployment_uid=wml_client.deployments.get_uid(deployment_details)\n", "\n", "print(\"Scoring URL:\" + scoring_url)\n", "print(\"Model id: {}\".format(model_uid))\n", "print(\"Deployment id: {}\".format(deployment_uid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sample scoring" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fields = [\"CheckingStatus\", \"LoanDuration\", \"CreditHistory\", \"LoanPurpose\", \"LoanAmount\", \"ExistingSavings\",\n", " \"EmploymentDuration\", \"InstallmentPercent\", \"Sex\", \"OthersOnLoan\", \"CurrentResidenceDuration\",\n", " \"OwnsProperty\", \"Age\", \"InstallmentPlans\", \"Housing\", \"ExistingCreditsCount\", \"Job\", \"Dependents\",\n", " \"Telephone\", \"ForeignWorker\"]\n", "values = [\n", " [\"no_checking\", 13, \"credits_paid_to_date\", \"car_new\", 1343, \"100_to_500\", \"1_to_4\", 2, \"female\", \"none\", 3,\n", " \"savings_insurance\", 46, \"none\", \"own\", 2, \"skilled\", 1, \"none\", \"yes\"],\n", " [\"no_checking\", 24, \"prior_payments_delayed\", \"furniture\", 4567, \"500_to_1000\", \"1_to_4\", 4, \"male\", \"none\",\n", " 4, \"savings_insurance\", 36, \"none\", \"free\", 2, \"management_self-employed\", 1, \"none\", \"yes\"],\n", " ]\n", 
"\n", "scoring_payload = {\"input_data\": [{\"fields\": fields, \"values\": values}]}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scoring_response = wml_client.deployments.score(deployment_uid, scoring_payload)\n", "scoring_response" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Configure OpenScale " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The notebook will now import the necessary libraries and set up a Python OpenScale client." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "from ibm_cloud_sdk_core.authenticators import IAMAuthenticator,BearerTokenAuthenticator\n", "\n", "from ibm_watson_openscale import *\n", "from ibm_watson_openscale.supporting_classes.enums import *\n", "from ibm_watson_openscale.supporting_classes import *\n", "\n", "\n", "authenticator = IAMAuthenticator(apikey=CLOUD_API_KEY)\n", "#authenticator = BearerTokenAuthenticator(bearer_token=IAM_TOKEN) ## uncomment this line if using IAM token to authenticate\n", "wos_client = APIClient(authenticator=authenticator)\n", "wos_client.version" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create schema and datamart" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up datamart" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Watson OpenScale uses a database to store payload logs and calculated metrics. If database credentials were **not** supplied above, the notebook will use the free, internal lite database. If database credentials were supplied, the datamart will be created there **unless** there is an existing datamart **and** the **KEEP_MY_INTERNAL_POSTGRES** variable is set to **True**. 
If an OpenScale datamart exists in Db2 or PostgreSQL, the existing datamart will be used and no data will be overwritten.\n", "\n", "Prior instances of the German Credit model will be removed from OpenScale monitoring." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_marts.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_marts = wos_client.data_marts.list().result.data_marts\n", "if len(data_marts) == 0:\n", "    if DB_CREDENTIALS is not None:\n", "        if SCHEMA_NAME is None:\n", "            print(\"Please specify the SCHEMA_NAME and rerun the cell\")\n", "\n", "        print('Setting up external datamart')\n", "        added_data_mart_result = wos_client.data_marts.add(\n", "            background_mode=False,\n", "            name=\"WOS Data Mart\",\n", "            description=\"Data Mart created by WOS tutorial notebook\",\n", "            database_configuration=DatabaseConfigurationRequest(\n", "                database_type=DatabaseType.POSTGRESQL,\n", "                credentials=PrimaryStorageCredentialsLong(\n", "                    hostname=DB_CREDENTIALS['hostname'],\n", "                    username=DB_CREDENTIALS['username'],\n", "                    password=DB_CREDENTIALS['password'],\n", "                    db=DB_CREDENTIALS['database'],\n", "                    port=DB_CREDENTIALS['port'],\n", "                    ssl=True,\n", "                    sslmode=DB_CREDENTIALS['sslmode'],\n", "                    certificate_base64=DB_CREDENTIALS['certificate_base64']\n", "                ),\n", "                location=LocationSchemaName(\n", "                    schema_name= SCHEMA_NAME\n", "                )\n", "            )\n", "        ).result\n", "    else:\n", "        print('Setting up internal datamart')\n", "        added_data_mart_result = wos_client.data_marts.add(\n", "            background_mode=False,\n", "            name=\"WOS Data Mart\",\n", "            description=\"Data Mart created by WOS tutorial notebook\", \n", "            internal_database = True).result\n", "\n", "    data_mart_id = added_data_mart_result.metadata.id\n", "\n", "else:\n", "    data_mart_id=data_marts[0].metadata.id\n", "    print('Using existing datamart {}'.format(data_mart_id))" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "### Remove existing service providers connected with the WML instance" ] }, { "cell_type": "markdown", "metadata": { "scrolled": true }, "source": [ "Watson OpenScale allows multiple service providers for the same engine instance. To avoid accumulating service providers for the WML instance used in this tutorial, the following code deletes any existing service provider(s) with the same name and then adds a new one. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SERVICE_PROVIDER_NAME = \"Watson Machine Learning V2\"\n", "SERVICE_PROVIDER_DESCRIPTION = \"Added by tutorial WOS notebook.\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "service_providers = wos_client.service_providers.list().result.service_providers\n", "for service_provider in service_providers:\n", "    service_instance_name = service_provider.entity.name\n", "    if service_instance_name == SERVICE_PROVIDER_NAME:\n", "        service_provider_id = service_provider.metadata.id\n", "        wos_client.service_providers.delete(service_provider_id)\n", "        print(\"Deleted existing service_provider for WML instance: {}\".format(service_provider_id))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Add service provider" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Watson OpenScale needs to be bound to the Watson Machine Learning instance to capture payload data into and out of the model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note:** You can bind more than one engine instance if needed by calling the `wos_client.service_providers.add` method. You can then refer to a particular service provider using its `service_provider_id`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "added_service_provider_result = wos_client.service_providers.add(\n", " name=SERVICE_PROVIDER_NAME,\n", " description=SERVICE_PROVIDER_DESCRIPTION,\n", " service_type=ServiceTypes.WATSON_MACHINE_LEARNING,\n", " deployment_space_id = WML_SPACE_ID,\n", " operational_space_id = \"production\",\n", " credentials=WMLCredentialsCloud(\n", " apikey=CLOUD_API_KEY, ## use `apikey=IAM_TOKEN` if using IAM_TOKEN to initiate client\n", " url=WML_CREDENTIALS[\"url\"],\n", " instance_id=None\n", " ),\n", " background_mode=False\n", " ).result\n", "service_provider_id = added_service_provider_result.metadata.id" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.service_providers.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "asset_deployment_details = wos_client.service_providers.list_assets(data_mart_id=data_mart_id, service_provider_id=service_provider_id, deployment_space_id = WML_SPACE_ID).result['resources'][0]\n", "asset_deployment_details" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_asset_details_from_deployment=wos_client.service_providers.get_deployment_asset(data_mart_id=data_mart_id,service_provider_id=service_provider_id,deployment_id=deployment_uid,deployment_space_id=WML_SPACE_ID)\n", "model_asset_details_from_deployment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Subscriptions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Remove existing credit risk subscriptions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This code removes previous subscriptions to the German Credit model to refresh the monitors with the new model and new data." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.subscriptions.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "subscriptions = wos_client.subscriptions.list().result.subscriptions\n", "for subscription in subscriptions:\n", "    sub_model_id = subscription.entity.asset.asset_id\n", "    if sub_model_id == model_uid:\n", "        wos_client.subscriptions.delete(subscription.metadata.id)\n", "        print('Deleted existing subscription for model', sub_model_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This code creates the model subscription in OpenScale using the Python client API. Note that we need to provide the model's unique identifier and some information about the model itself." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from ibm_watson_openscale.base_classes.watson_open_scale_v2 import ScoringEndpointRequest" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "subscription_details = wos_client.subscriptions.add(\n", " data_mart_id=data_mart_id,\n", " service_provider_id=service_provider_id,\n", " asset=Asset(\n", " asset_id=model_asset_details_from_deployment[\"entity\"][\"asset\"][\"asset_id\"],\n", " name=model_asset_details_from_deployment[\"entity\"][\"asset\"][\"name\"],\n", " url=model_asset_details_from_deployment[\"entity\"][\"asset\"][\"url\"],\n", " asset_type=AssetTypes.MODEL,\n", " input_data_type=InputDataType.STRUCTURED,\n", " problem_type=ProblemType.BINARY_CLASSIFICATION\n", " ),\n", " deployment=AssetDeploymentRequest(\n", " deployment_id=asset_deployment_details['metadata']['guid'],\n", " name=asset_deployment_details['entity']['name'],\n", " deployment_type= DeploymentTypes.ONLINE,\n", " url=asset_deployment_details['metadata']['url'],\n", " scoring_endpoint=ScoringEndpointRequest(url=scoring_url) # score model without shadow deployment\n", " ),\n", " asset_properties=AssetPropertiesRequest(\n", " label_column='Risk',\n", " probability_fields=['probability'],\n", " prediction_field='predictedLabel',\n", " feature_fields = [\"CheckingStatus\",\"LoanDuration\",\"CreditHistory\",\"LoanPurpose\",\"LoanAmount\",\"ExistingSavings\",\"EmploymentDuration\",\"InstallmentPercent\",\"Sex\",\"OthersOnLoan\",\"CurrentResidenceDuration\",\"OwnsProperty\",\"Age\",\"InstallmentPlans\",\"Housing\",\"ExistingCreditsCount\",\"Job\",\"Dependents\",\"Telephone\",\"ForeignWorker\"],\n", " categorical_fields = [\"CheckingStatus\",\"CreditHistory\",\"LoanPurpose\",\"ExistingSavings\",\"EmploymentDuration\",\"Sex\",\"OthersOnLoan\",\"OwnsProperty\",\"InstallmentPlans\",\"Housing\",\"Job\",\"Telephone\",\"ForeignWorker\"],\n", " 
training_data_reference=TrainingDataReference(type='cos',\n", " location=COSTrainingDataReferenceLocation(bucket = BUCKET_NAME,\n", " file_name = training_data_file_name),\n", " connection=COSTrainingDataReferenceConnection.from_dict({\n", " \"resource_instance_id\": COS_RESOURCE_CRN,\n", " \"url\": COS_ENDPOINT,\n", " \"api_key\": COS_API_KEY_ID,\n", " \"iam_url\": IAM_URL})),\n", " training_data_schema=SparkStruct.from_dict(model_asset_details_from_deployment[\"entity\"][\"asset_properties\"][\"training_data_schema\"])\n", " )\n", " ).result\n", "subscription_id = subscription_details.metadata.id\n", "subscription_id" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "\n", "time.sleep(5)\n", "payload_data_set_id = None\n", "payload_data_set_id = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, \n", " target_target_id=subscription_id, \n", " target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets[0].metadata.id\n", "if payload_data_set_id is None:\n", " print(\"Payload data set not found. Please check subscription status.\")\n", "else:\n", " print(\"Payload data set id: \", payload_data_set_id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_sets.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get subscription list" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "wos_client.subscriptions.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Score the model so we can configure monitors" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that the WML service has been bound and the subscription has been created, we need to send a request to the model before we configure OpenScale. 
This allows OpenScale to create a payload log in the datamart with the correct schema, so it can capture data coming into and out of the model. First, the code gets the model deployment's endpoint URL, and then sends a few records for predictions." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "fields = [\"CheckingStatus\",\"LoanDuration\",\"CreditHistory\",\"LoanPurpose\",\"LoanAmount\",\"ExistingSavings\",\"EmploymentDuration\",\"InstallmentPercent\",\"Sex\",\"OthersOnLoan\",\"CurrentResidenceDuration\",\"OwnsProperty\",\"Age\",\"InstallmentPlans\",\"Housing\",\"ExistingCreditsCount\",\"Job\",\"Dependents\",\"Telephone\",\"ForeignWorker\"]\n", "values = [\n", " [\"no_checking\",13,\"credits_paid_to_date\",\"car_new\",1343,\"100_to_500\",\"1_to_4\",2,\"female\",\"none\",3,\"savings_insurance\",46,\"none\",\"own\",2,\"skilled\",1,\"none\",\"yes\"],\n", " [\"no_checking\",24,\"prior_payments_delayed\",\"furniture\",4567,\"500_to_1000\",\"1_to_4\",4,\"male\",\"none\",4,\"savings_insurance\",36,\"none\",\"free\",2,\"management_self-employed\",1,\"none\",\"yes\"],\n", " [\"0_to_200\",26,\"all_credits_paid_back\",\"car_new\",863,\"less_100\",\"less_1\",2,\"female\",\"co-applicant\",2,\"real_estate\",38,\"none\",\"own\",1,\"skilled\",1,\"none\",\"yes\"],\n", " [\"0_to_200\",14,\"no_credits\",\"car_new\",2368,\"less_100\",\"1_to_4\",3,\"female\",\"none\",3,\"real_estate\",29,\"none\",\"own\",1,\"skilled\",1,\"none\",\"yes\"],\n", " [\"0_to_200\",4,\"no_credits\",\"car_new\",250,\"less_100\",\"unemployed\",2,\"female\",\"none\",3,\"real_estate\",23,\"none\",\"rent\",1,\"management_self-employed\",1,\"none\",\"yes\"],\n", " [\"no_checking\",17,\"credits_paid_to_date\",\"car_new\",832,\"100_to_500\",\"1_to_4\",2,\"male\",\"none\",2,\"real_estate\",42,\"none\",\"own\",1,\"skilled\",1,\"none\",\"yes\"],\n", " 
[\"no_checking\",33,\"outstanding_credit\",\"appliances\",5696,\"unknown\",\"greater_7\",4,\"male\",\"co-applicant\",4,\"unknown\",54,\"none\",\"free\",2,\"skilled\",1,\"yes\",\"yes\"],\n", " [\"0_to_200\",13,\"prior_payments_delayed\",\"retraining\",1375,\"100_to_500\",\"4_to_7\",3,\"male\",\"none\",3,\"real_estate\",37,\"none\",\"own\",2,\"management_self-employed\",1,\"none\",\"yes\"]\n", "]\n", "\n", "payload_scoring = {\"input_data\": [{\"fields\": fields, \"values\": values}]}\n", "predictions = wml_client.deployments.score(deployment_uid, payload_scoring)\n", "\n", "print(\"Single record scoring result:\", \"\\n fields:\", predictions[\"predictions\"][0][\"fields\"], \"\\n values: \", predictions[\"predictions\"][0][\"values\"][0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Check if WML payload logging worked else manually store payload records" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import uuid\n", "from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord\n", "time.sleep(5)\n", "pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)\n", "print(\"Number of records in the payload logging table: {}\".format(pl_records_count))\n", "if pl_records_count == 0:\n", " print(\"Payload logging did not happen, performing explicit payload logging.\")\n", " wos_client.data_sets.store_records(data_set_id=payload_data_set_id, request_body=[PayloadRecord(\n", " scoring_id=str(uuid.uuid4()),\n", " request=payload_scoring,\n", " response={\"fields\": predictions['predictions'][0]['fields'], \"values\":predictions['predictions'][0]['values']},\n", " response_time=460\n", " )])\n", " time.sleep(5)\n", " pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)\n", " print(\"Number of records in the payload logging table: {}\".format(pl_records_count))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Quality 
monitoring and feedback logging " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Enable quality monitoring" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The code below waits ten seconds to allow the payload logging table to be set up before it begins enabling monitors. First, it turns on the quality (accuracy) monitor and sets an alert threshold of 80%. OpenScale will show an alert on the dashboard if the model accuracy measurement (area under the ROC curve, in the case of a binary classifier) falls below this threshold.\n", "\n", "The `min_feedback_data_size` parameter specifies the minimum number of feedback records OpenScale needs before it calculates a new measurement. The quality monitor runs hourly, but the accuracy reading in the dashboard will not change until an additional 50 feedback records have been added, via the user interface, the Python client, or the supplied feedback endpoint." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "import time\n", "\n", "time.sleep(10)\n", "target = Target(\n", " target_type=TargetTypes.SUBSCRIPTION,\n", " target_id=subscription_id\n", ")\n", "parameters = {\n", " \"min_feedback_data_size\": 50\n", "}\n", "thresholds = [\n", " {\n", " \"metric_id\": \"area_under_roc\",\n", " \"type\": \"lower_limit\",\n", " \"value\": 0.80\n", " }\n", " ]\n", "quality_monitor_details = wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,\n", " target=target,\n", " parameters=parameters,\n", " thresholds=thresholds\n", ").result" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "quality_monitor_instance_id = quality_monitor_details.metadata.id\n", "quality_monitor_instance_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Feedback logging" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "The code below downloads and stores enough feedback data to meet the minimum threshold so that OpenScale can calculate a new accuracy measurement. It then kicks off the accuracy monitor. The monitors run hourly, or can be initiated via the Python API, the REST API, or the graphical user interface." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm additional_feedback_data_v2.json\n", "!wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/credit_risk/additional_feedback_data_v2.json" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get feedback logging dataset ID" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "feedback_dataset_id = None\n", "feedback_dataset = wos_client.data_sets.list(type=DataSetTypes.FEEDBACK, \n", " target_target_id=subscription_id, \n", " target_target_type=TargetTypes.SUBSCRIPTION).result\n", "print(feedback_dataset)\n", "feedback_dataset_id = feedback_dataset.data_sets[0].metadata.id\n", "if feedback_dataset_id is None:\n", " print(\"Feedback data set not found. 
Please check quality monitor status.\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open('additional_feedback_data_v2.json') as feedback_file:\n", " additional_feedback_data = json.load(feedback_file)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_sets.store_records(feedback_dataset_id, request_body=additional_feedback_data, background_mode=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "wos_client.data_sets.get_records_count(data_set_id=feedback_dataset_id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_instance_id, background_mode=False).result" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_instance_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Fairness, drift monitoring and explanations " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fairness configuration\n", "\n", "The code below configures fairness monitoring for our model. It turns on monitoring for two features, Sex and Age. 
In each case, we must specify:\n", "\n", " * Which model feature to monitor\n", " * One or more **majority** groups, which are values of that feature that we expect to receive a higher percentage of favourable outcomes\n", " * One or more **minority** groups, which are values of that feature that we expect to receive a higher percentage of unfavourable outcomes\n", " * The threshold below which OpenScale displays an alert for the fairness measurement (in this case, 95%)\n", "\n", "Additionally, we must specify which outcomes from the model are favourable and which are unfavourable. We must also provide the minimum number of records OpenScale will use to calculate the fairness score. In this case, OpenScale's fairness monitor will run hourly, but will not calculate a new fairness rating until at least 100 records have been added. Finally, to calculate fairness, OpenScale must perform some calculations on the training data, so it uses the training data reference supplied when the subscription was created." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.monitor_instances.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "target = Target(\n", " target_type=TargetTypes.SUBSCRIPTION,\n", " target_id=subscription_id\n", "\n", ")\n", "parameters = {\n", " \"features\": [\n", " {\"feature\": \"Sex\",\n", " \"majority\": ['male'],\n", " \"minority\": ['female'],\n", " \"threshold\": 0.95\n", " },\n", " {\"feature\": \"Age\",\n", " \"majority\": [[26, 75]],\n", " \"minority\": [[18, 25]],\n", " \"threshold\": 0.95\n", " }\n", " ],\n", " \"favourable_class\": [\"No Risk\"],\n", " \"unfavourable_class\": [\"Risk\"],\n", " \"min_records\": 100\n", "}\n", "\n", "fairness_monitor_details = wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=wos_client.monitor_definitions.MONITORS.FAIRNESS.ID,\n", " target=target,\n", " parameters=parameters).result\n", "fairness_monitor_instance_id =fairness_monitor_details.metadata.id\n", "fairness_monitor_instance_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Drift configuration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Note: you can choose to enable/disable (True or False) model or data drift within config" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "monitor_instances = wos_client.monitor_instances.list().result.monitor_instances\n", "for monitor_instance in monitor_instances:\n", " monitor_def_id=monitor_instance.entity.monitor_definition_id\n", " if monitor_def_id == \"drift\" and monitor_instance.entity.target.target_id == subscription_id:\n", " wos_client.monitor_instances.delete(monitor_instance.metadata.id)\n", " print('Deleted existing drift monitor instance with id: ', monitor_instance.metadata.id)\n", "\n", "\n", "target = 
Target(\n", " target_type=TargetTypes.SUBSCRIPTION,\n", " target_id=subscription_id\n", "\n", ")\n", "parameters = {\n", " \"min_samples\": 100,\n", " \"drift_threshold\": 0.1,\n", " \"train_drift_model\": True,\n", " \"enable_model_drift\": False,\n", " \"enable_data_drift\": True\n", "}\n", "\n", "drift_monitor_details = wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=wos_client.monitor_definitions.MONITORS.DRIFT.ID,\n", " target=target,\n", " parameters=parameters\n", ").result\n", "\n", "drift_monitor_instance_id = drift_monitor_details.metadata.id\n", "drift_monitor_instance_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Score the model again now that monitoring is configured" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This next section randomly selects 200 records from the data feed and sends those records to the model for predictions. This is enough to exceed the minimum threshold for records set in the previous section, which allows OpenScale to begin calculating fairness." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with io.capture_output() as captured:\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/credit_risk/german_credit_feed.json\n", "!ls -lh german_credit_feed.json" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Score 200 randomly chosen records" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import random\n", "\n", "with open('german_credit_feed.json', 'r') as scoring_file:\n", " scoring_data = json.load(scoring_file)\n", "\n", "fields = scoring_data['fields']\n", "values = []\n", "for _ in range(200):\n", " values.append(random.choice(scoring_data['values']))\n", "payload_scoring = {\"input_data\": [{\"fields\": fields, \"values\": values}]}\n", "\n", "scoring_response = wml_client.deployments.score(deployment_uid, payload_scoring)\n", "\n", "time.sleep(5)\n", "pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)\n", "\n", "# A count of 8 means only the records from the first scoring call were logged\n", "if pl_records_count == 8:\n", " print(\"Payload logging did not happen, performing explicit payload logging.\")\n", " wos_client.data_sets.store_records(data_set_id=payload_data_set_id, request_body=[PayloadRecord(\n", " scoring_id=str(uuid.uuid4()),\n", " request=payload_scoring,\n", " response={\"fields\": scoring_response['predictions'][0]['fields'], \"values\":scoring_response['predictions'][0]['values']},\n", " response_time=460\n", " )])\n", " time.sleep(5)\n", " pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)\n", " print(\"Number of records in the payload logging table: {}\".format(pl_records_count))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note:** The payload table should now contain 208 records in total." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Number of records in payload table: ', wos_client.data_sets.get_records_count(data_set_id=payload_data_set_id))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run fairness monitor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Kick off a fairness monitor run on current data. The monitor runs hourly, but can be manually initiated using the Python client, the REST API, or the graphical user interface." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "run_details = wos_client.monitor_instances.run(monitor_instance_id=fairness_monitor_instance_id, background_mode=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time.sleep(10)\n", "\n", "wos_client.monitor_instances.show_metrics(monitor_instance_id=fairness_monitor_instance_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run drift monitor\n", "\n", "\n", "Kick off a drift monitor run on current data. The monitor runs every hour, but can be manually initiated using the Python client or the REST API." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "drift_run_details = wos_client.monitor_instances.run(monitor_instance_id=drift_monitor_instance_id, background_mode=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time.sleep(5)\n", "\n", "wos_client.monitor_instances.show_metrics(monitor_instance_id=drift_monitor_instance_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configure Explainability" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we provide OpenScale with the training data to enable and configure the explainability features." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "target = Target(\n", " target_type=TargetTypes.SUBSCRIPTION,\n", " target_id=subscription_id\n", ")\n", "parameters = {\n", " \"enabled\": True\n", "}\n", "explainability_details = wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,\n", " target=target,\n", " parameters=parameters\n", ").result\n", "\n", "explainability_monitor_id = explainability_details.metadata.id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run explanation for sample record" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pl_records_resp = wos_client.data_sets.get_list_of_records(data_set_id=payload_data_set_id, limit=1, offset=0).result\n", "scoring_ids = [pl_records_resp[\"records\"][0][\"entity\"][\"values\"][\"scoring_id\"]]\n", "print(\"Running explanations on scoring IDs: {}\".format(scoring_ids))\n", "explanation_types = [\"lime\", \"contrastive\"]\n", "result = wos_client.monitor_instances.explanation_tasks(scoring_ids=scoring_ids, explanation_types=explanation_types).result\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Custom monitors and metrics " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Register custom monitor" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_definition(monitor_name):\n", " monitor_definitions = wos_client.monitor_definitions.list().result.monitor_definitions\n", " \n", " for definition in monitor_definitions:\n", " if monitor_name == definition.entity.name:\n", " return definition\n", " \n", " return None" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "monitor_name = 'my model 
performance'\n", "metrics = [MonitorMetricRequest(name='sensitivity',\n", " thresholds=[MetricThreshold(type=MetricThresholdTypes.LOWER_LIMIT, default=0.8)]),\n", " MonitorMetricRequest(name='specificity',\n", " thresholds=[MetricThreshold(type=MetricThresholdTypes.LOWER_LIMIT, default=0.75)])]\n", "tags = [MonitorTagRequest(name='region', description='customer geographical region')]\n", "\n", "existing_definition = get_definition(monitor_name)\n", "\n", "if existing_definition is None:\n", " custom_monitor_details = wos_client.monitor_definitions.add(name=monitor_name, metrics=metrics, tags=tags, background_mode=False).result\n", "else:\n", " custom_monitor_details = existing_definition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Show available monitor types" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.monitor_definitions.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get monitor IDs and details" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "custom_monitor_id = custom_monitor_details.metadata.id\n", "\n", "print(custom_monitor_id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "custom_monitor_details = wos_client.monitor_definitions.get(monitor_definition_id=custom_monitor_id).result\n", "print('Monitor definition details:', custom_monitor_details)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Enable custom monitor for subscription" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "target = Target(\n", " target_type=TargetTypes.SUBSCRIPTION,\n", " target_id=subscription_id\n", " )\n", "\n", "thresholds = [MetricThresholdOverride(metric_id='sensitivity', type = MetricThresholdTypes.LOWER_LIMIT, value=0.9)]\n", "\n", "custom_monitor_instance_details = 
wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=custom_monitor_id,\n", " target=target\n", ").result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get monitor instance id and configuration details" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "custom_monitor_instance_id = custom_monitor_instance_details.metadata.id" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "custom_monitor_instance_details = wos_client.monitor_instances.get(custom_monitor_instance_id).result\n", "print(custom_monitor_instance_details)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Storing custom metrics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datetime import datetime, timezone, timedelta\n", "from ibm_watson_openscale.base_classes.watson_open_scale_v2 import MonitorMeasurementRequest\n", "custom_monitoring_run_id = \"11122223333111abc\"\n", "measurement_request = [MonitorMeasurementRequest(timestamp=datetime.now(timezone.utc), \n", " metrics=[{\"specificity\": 0.78, \"sensitivity\": 0.67, \"region\": \"us-south\"}], run_id=custom_monitoring_run_id)]\n", "print(measurement_request[0])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "published_measurement_response = wos_client.monitor_instances.measurements.add(\n", " monitor_instance_id=custom_monitor_instance_id,\n", " monitor_measurement_request=measurement_request).result\n", "published_measurement_id = published_measurement_response[0][\"measurement_id\"]\n", "print(published_measurement_response)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### List and get custom metrics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time.sleep(5)\n", 
"published_measurement = wos_client.monitor_instances.measurements.get(monitor_instance_id=custom_monitor_instance_id, measurement_id=published_measurement_id).result\n", "print(published_measurement)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Historical data " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "historyDays = 7" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Insert historical fairness metrics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm history_fairness_v2.json\n", "with io.capture_output() as captured:\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_fairness_v2.json\n", "!ls -lh history_fairness_v2.json" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datetime import datetime, timedelta, timezone\n", "\n", "with open('history_fairness_v2.json', 'r') as history_file:\n", " payloads = json.load(history_file)\n", "\n", "for day in range(historyDays):\n", " print('Loading day', day + 1)\n", " daily_measurement_requests = []\n", " \n", " for hour in range(24):\n", " score_time = datetime.now(timezone.utc) + timedelta(hours=(-(24*day + hour + 1)))\n", " index = (day * 24 + hour) % len(payloads) # wrap around and reuse values if needed\n", " \n", " measurement_request = MonitorMeasurementRequest(timestamp=score_time,metrics = [payloads[index][0], payloads[index][1]])\n", " daily_measurement_requests.append(measurement_request)\n", " \n", " \n", " response = wos_client.monitor_instances.measurements.add(\n", " monitor_instance_id=fairness_monitor_instance_id,\n", " monitor_measurement_request=daily_measurement_requests).result \n", "print('Finished')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Insert historical debias metrics" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm history_debias_v2.json\n", "with io.capture_output() as captured:\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_debias_v2.json\n", "!ls -lh history_debias_v2.json" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open('history_debias_v2.json', 'r') as history_file:\n", " payloads = json.load(history_file)\n", "\n", "for day in range(historyDays):\n", " print('Loading day', day + 1)\n", " daily_measurement_requests = []\n", " for hour in range(24):\n", " score_time = datetime.now(timezone.utc) + timedelta(hours=(-(24*day + hour + 1)))\n", " index = (day * 24 + hour) % len(payloads) # wrap around and reuse values if needed\n", "\n", " measurement_request = MonitorMeasurementRequest(timestamp=score_time,metrics = [payloads[index][0], payloads[index][1]])\n", " \n", " daily_measurement_requests.append(measurement_request)\n", " \n", " response = wos_client.monitor_instances.measurements.add(\n", " monitor_instance_id=fairness_monitor_instance_id,\n", " monitor_measurement_request=daily_measurement_requests).result \n", "\n", "print('Finished')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Insert historical quality metrics" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "measurements = [0.76, 0.78, 0.68, 0.72, 0.73, 0.77, 0.80]\n", "for day in range(historyDays):\n", " quality_measurement_requests = []\n", " print('Loading day', day + 1)\n", " for hour in range(24):\n", " score_time = datetime.utcnow() + timedelta(hours=(-(24*day + hour + 1)))\n", " score_time = score_time.isoformat() + \"Z\"\n", " \n", " metric = {\"area_under_roc\": measurements[day]}\n", " \n", " measurement_request = MonitorMeasurementRequest(timestamp=score_time,metrics = [metric])\n", 
" quality_measurement_requests.append(measurement_request)\n", " \n", " \n", " response = wos_client.monitor_instances.measurements.add(\n", " monitor_instance_id=quality_monitor_instance_id,\n", " monitor_measurement_request=quality_measurement_requests).result \n", " \n", "print('Finished')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Insert historical confusion matrixes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm history_quality_metrics.json\n", "with io.capture_output() as captured:\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_quality_metrics.json\n", "!ls -lh history_quality_metrics.json" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from ibm_watson_openscale.base_classes.watson_open_scale_v2 import Source\n", "\n", "with open('history_quality_metrics.json') as json_file:\n", " records = json.load(json_file)\n", " \n", "for day in range(historyDays):\n", " index = 0\n", " cm_measurement_requests = []\n", " print('Loading day', day + 1)\n", " \n", " for hour in range(24):\n", " score_time = datetime.utcnow() + timedelta(hours=(-(24*day + hour + 1)))\n", " score_time = score_time.isoformat() + \"Z\"\n", "\n", " metric = records[index]['metrics']\n", " source = records[index]['sources']\n", "\n", " \n", " measurement_request = {\"timestamp\": score_time, \"metrics\": [metric], \"sources\": [source]}\n", " cm_measurement_requests.append(measurement_request)\n", "\n", " index+=1\n", "\n", " response = wos_client.monitor_instances.measurements.add(monitor_instance_id=quality_monitor_instance_id, monitor_measurement_request=cm_measurement_requests).result \n", "\n", "print('Finished')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Insert historical performance metrics" ] }, { "cell_type": "code", "execution_count": null, 
"metadata": {}, "outputs": [], "source": [ "target = Target(\n", " target_type=TargetTypes.INSTANCE,\n", " target_id=payload_data_set_id\n", " )\n", "\n", "\n", "performance_monitor_instance_details = wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=wos_client.monitor_definitions.MONITORS.PERFORMANCE.ID,\n", " target=target\n", ").result\n", "performance_monitor_instance_id = performance_monitor_instance_details.metadata.id\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for day in range(historyDays):\n", " performance_measurement_requests = []\n", " print('Loading day', day + 1)\n", " for hour in range(24):\n", " score_time = datetime.utcnow() + timedelta(hours=(-(24*day + hour + 1)))\n", " score_time = score_time.isoformat() + \"Z\"\n", " score_count = random.randint(60, 600)\n", " \n", " metric = {\"record_count\": score_count, \"data_set_type\": \"scoring_payload\"}\n", " \n", " measurement_request = {\"timestamp\": score_time, \"metrics\": [metric]}\n", " performance_measurement_requests.append(measurement_request)\n", " \n", " response = wos_client.monitor_instances.measurements.add(\n", " monitor_instance_id=performance_monitor_instance_id,\n", " monitor_measurement_request=performance_measurement_requests).result \n", "\n", "print('Finished')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Insert historical drift measurements" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with io.capture_output() as captured:\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_drift_measurement_0.json\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_drift_measurement_1.json\n", " !wget 
https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_drift_measurement_2.json\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_drift_measurement_3.json\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_drift_measurement_4.json\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_drift_measurement_5.json\n", " !wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/historical_data/credit_risk/history_drift_measurement_6.json\n", "!ls -lh history_drift_measurement_*.json" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for day in range(historyDays):\n", " drift_measurements = []\n", "\n", " with open(\"history_drift_measurement_{}.json\".format(day), 'r') as history_file:\n", " drift_daily_measurements = json.load(history_file)\n", " print('Loading day', day + 1)\n", "\n", " # The historical data contains 8 records per day; each one represents a 3-hour drift window.\n", " \n", " for nb_window, records in enumerate(drift_daily_measurements):\n", " for record in records:\n", " window_start = datetime.utcnow() + timedelta(hours=(-(24 * day + (nb_window+1)*3 + 1))) # first_payload_record_timestamp_in_window (oldest)\n", " window_end = datetime.utcnow() + timedelta(hours=(-(24 * day + nb_window*3 + 1))) # last_payload_record_timestamp_in_window (most recent)\n", " # Shift each record's start and end times to match the computed window.\n", " record['sources'][0]['data']['start'] = window_start.isoformat() + \"Z\"\n", " record['sources'][0]['data']['end'] = window_end.isoformat() + \"Z\"\n", " \n", " \n", " metric = record['metrics'][0]\n", " source = 
record['sources'][0]\n", "\n", " measurement_request = {\"timestamp\": window_start.isoformat() + \"Z\", \"metrics\": [metric], \"sources\": [source]}\n", " \n", " drift_measurements.append(measurement_request)\n", " \n", " response = wos_client.monitor_instances.measurements.add(\n", " monitor_instance_id=drift_monitor_instance_id,\n", " monitor_measurement_request=drift_measurements).result \n", "\n", " \n", " print(\"Daily loading finished.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Additional data to help debugging" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "print('Datamart:', data_mart_id)\n", "print('Model:', model_uid)\n", "print('Deployment:', deployment_uid)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Identify transactions for Explainability" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Transaction IDs identified by the cells below can be copied and pasted into the Explainability tab of the OpenScale dashboard." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "wos_client.data_sets.show_records(payload_data_set_id, limit=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Congratulations!\n", "\n", "You have finished the hands-on lab for IBM Watson OpenScale. You can now view the [OpenScale Dashboard](https://aiopenscale.cloud.ibm.com/). Click on the tile for the German Credit model to see the fairness, accuracy, and performance monitors. Click on the time-series graph to get detailed information on transactions during a specific time window.\n", "\n", "## Next steps\n", "\n", "OpenScale shows model performance over time. 
You have two options to keep data flowing to your OpenScale graphs:\n", " * Download, configure and schedule the [model feed notebook](https://raw.githubusercontent.com/emartensibm/german-credit/master/german_credit_scoring_feed.ipynb). This notebook can be set up with your WML credentials, and scheduled to provide a consistent flow of scoring requests to your model, which will appear in your OpenScale monitors.\n", " * Re-run this notebook. Running this notebook from the beginning will delete and re-create the model and deployment, and re-create the historical data. Please note that the payload and measurement logs for the previous deployment will continue to be stored in your datamart, and can be deleted if necessary." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.7", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }