ML Studio Developer Documentation for ML Client

Client

The term “Client” represents the Acquia ML Studio service.

Project

delete_project()

Deletes all the project metadata initialized with the client. However, this method deletes metadata only if the linked output data and schedules are already deleted.

Note

This method deletes all the project metadata and frees up one of the CDP ML Studio tables to be used by another project. Ensure that you use this method only when appropriate.

Request syntax

response = client.delete_project()

Parameters

No parameters need to be passed to this method.

Return type

Returns nothing

Exceptions/errors

  • ProjectOutputExistsException: Occurs if there is existing output data linked to the project. Delete the data before deleting the project.
  • ProjectScheduleExistsException: Occurs if there are existing schedules linked to the project. Delete the schedules before deleting the project.
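The two exceptions above imply a teardown order: remove schedules first, then output data, then the project metadata. The following is a minimal sketch of that order, assuming a client that exposes the methods documented on this page. `StubClient` and `teardown_project` are hypothetical names that exist only so the example runs on its own; the stub raises a generic `RuntimeError` and does not reproduce the real service or its exception classes.

```python
class StubClient:
    """Hypothetical in-memory stand-in for an ML Studio client (illustration only)."""

    def __init__(self):
        self.schedules = ["nightly-scores"]
        self.output = {"ml_table_1": {"description": "churn scores"}}
        self.calls = []

    def describe_project(self):
        return {
            "name": "demo",
            "output_data_metadata": dict(self.output),
            "schedules": list(self.schedules),
            "model_names": [],
        }

    def delete_schedule(self, schedule_name=None, notebook_full_path=None):
        self.schedules.remove(schedule_name)
        self.calls.append("delete_schedule")

    def delete_output_data(self):
        self.output.clear()
        self.calls.append("delete_output_data")

    def delete_project(self):
        # Imitates the documented precondition with a generic error.
        if self.schedules or self.output:
            raise RuntimeError("schedules or output data still exist")
        self.calls.append("delete_project")


def teardown_project(client):
    """Delete schedules, then output data, then the project metadata."""
    info = client.describe_project()
    for name in info.get("schedules", []):
        client.delete_schedule(schedule_name=name)
    if info.get("output_data_metadata"):
        client.delete_output_data()
    client.delete_project()


stub = StubClient()
teardown_project(stub)
print(stub.calls)
```

With a real client, each of these calls is destructive, so a helper like this is best run only after the `describe_project()` output has been reviewed.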

describe_project()

Lists information about the project linked with the client.

Request syntax

response = client.describe_project()

Parameters

No parameters need to be passed to this method.

Return type

dict

Returns

{
    "name": "string",
    "output_data_metadata": {
        <table_name>: {
            "description": "string"
        }
    },
    "schedules": List[string],
    "model_names": List[string]
}

Response structure

(dict)

  • –name (string)

    Indicates the name of the project used during client initialization.

  • –<table_name> (dict)

    Indicates the table name for the CDP table associated with the project. This is available only if output data linked to the project is written.

  • –description (string)

    Indicates the description provided while writing the output data, if any.

  • –schedules (List[string])

    Indicates the list of all schedule names linked to the project, if any.

  • –model_names (List[string])

    Indicates the list of names of all models created in the project. These models cannot be used in other projects.

Exceptions/errors

No specific exception occurs for this method.
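As an illustration of the response structure, the following sketch condenses a `describe_project()`-style dictionary into a one-line summary. `summarize_project` is a hypothetical helper, and the sample values are invented; only the key names come from the structure documented above.

```python
def summarize_project(info):
    """Return a one-line summary of a describe_project()-style dict."""
    tables = sorted(info.get("output_data_metadata", {}))
    return (
        f"project={info['name']} "
        f"tables={len(tables)} "
        f"schedules={len(info.get('schedules', []))} "
        f"models={len(info.get('model_names', []))}"
    )


# Sample data shaped like the documented response; the values are invented.
sample = {
    "name": "churn",
    "output_data_metadata": {"ml_table_1": {"description": "weekly scores"}},
    "schedules": ["weekly-run"],
    "model_names": ["gbm_v1", "gbm_v2"],
}
summary = summarize_project(sample)
print(summary)
```

With a real client, the call would presumably be `summarize_project(client.describe_project())`.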

list_project_names()

Lists all projects available in the ML Studio instance.

Request syntax

response = client.list_project_names()

Parameters

No parameters need to be passed to this method.

Return type

List[string]

Response structure

[string, string, ........]

Exceptions/errors

No specific exception occurs for this method.

CDP Data

delete_output_data()

Deletes the data in CDP from the table associated with the project and frees up one of the five available output spots.

Note

This method removes any custom model scores that are written in the linked project. Use this method only if you are certain that you want to delete the custom model scores.

Request syntax

response = client.delete_output_data()

Parameters

No parameters need to be passed to this method.

Return type

Returns nothing

Exceptions/errors

  • ProjectOutputDoesNotExistException: Occurs if no output data has been written for the project.
  • ProjectBaseException: Occurs for any other, generic error.

describe_output_data()

Describes the output CDP data table associated with the current project. The details for each output consist of an optional description.

Request syntax

response = client.describe_output_data()

Parameters

No parameters need to be passed to this method.

Return type

dict

Returns

{
    <table_name>: {
        "description": "string"
    }
}

Response structure

(dict)

  • –<table_name> (dict)

    Indicates the table name for the CDP table associated with the project.

  • –description (str)

    Indicates the description provided while writing the output data.

Exceptions/errors

ValueError: Occurs if no output data has been written for the project.

list_available_tables()

Lists all CDP tables available to query.

Request syntax

response = client.list_available_tables()

Parameters

No parameters need to be passed to this method.

Return type

pandas DataFrame

Returns

A pandas dataframe with the available tables and their metadata.

Response structure

(pandas.DataFrame)

Some important columns:

  • –created_on (timestamp)

    Indicates the table creation datetime. This value usually matches the Metrics data refresh datetime.

  • –name (str)

    Indicates the name of the CDP table.

Exceptions/errors

No specific exception occurs for this method.

query_data(**kwargs)

Runs a SELECT query on the available CDP tables. Only read-only queries are allowed; a query that contains a data-modifying keyword is rejected.

Request syntax

response = client.query_data(query="string")

Parameters

  • query (string): A select query to query any of the available tables in CDP.

Return type

pandas DataFrame

Returns

A pandas dataframe with the results of the query.

Response structure

(pandas.DataFrame)

All columns that are in the provided query.

Exceptions/errors

ValueError: Occurs if any of the following keywords appears in the query:

"DROP", "TRUNCATE", "ALTER", "INSERT", "CREATE", "UPDATE", "DELETE", "REMOVE"

Examples

qry = """
SELECT MasterCustomerID, TransactionID, TransactionDate, SaleRevenue
FROM transactionsummary
"""

df = client.query_data(query=qry)
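Because forbidden keywords raise a ValueError, it can be convenient to check a query locally before sending it. The sketch below is one possible pre-check; `find_forbidden_keywords` is a hypothetical helper, and it assumes whole-word, case-insensitive matching. How the service itself matches keywords is not documented here, so a query that passes this check could still be rejected.

```python
import re

# Keyword list copied from the ValueError description above.
FORBIDDEN = ("DROP", "TRUNCATE", "ALTER", "INSERT", "CREATE", "UPDATE", "DELETE", "REMOVE")


def find_forbidden_keywords(query):
    """Return the forbidden keywords that appear as whole words in the query."""
    # Identifiers such as created_on stay single tokens, so they do not match CREATE.
    words = set(re.findall(r"[A-Za-z_]+", query.upper()))
    return [keyword for keyword in FORBIDDEN if keyword in words]


ok_query = "SELECT MasterCustomerID, created_on FROM transactionsummary"
bad_query = "delete from transactionsummary; DROP TABLE transactionsummary"
print(find_forbidden_keywords(ok_query))
print(find_forbidden_keywords(bad_query))
```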

read_output_data()

Reads the contents of the CDP ML Studio table linked to the project.

Request syntax

response = client.read_output_data()

Parameters

No parameters need to be passed to this method.

Return type

pandas DataFrame

Returns

A pandas dataframe with all the columns in the CDP ML Studio table linked to the project.

Response structure

(pandas.DataFrame)

  • –MASTERCUSTOMERID (string)

    Indicates the MasterCustomerID of the record for which a model score is stored. This matches the MasterCustomerID in CDP.

  • –SCORE (double)

    Indicates the prediction of the custom model, depending on how the model is built. For example, for likelihood models, this is the probability of a certain event.

  • –VALUE_NAME (string)

    Indicates the business friendly name for the model prediction. Typically, Actions, Metrics, and 360 Profile display this value.

  • –ROW_CREATED (timestamp)

    Indicates the timestamp of record creation.

Exceptions/errors

ValueError: Occurs if you try to read output data before any output data has been written for the project.

write_output_data(**kwargs)

Writes data to the CDP ML Studio table linked to the project. Every client initialized with the same project name writes to the same table. There are five ML Studio tables available to write to. Therefore, you can have five projects in your notebook instance.

Request syntax

response = client.write_output_data(df="pandas.DataFrame", description="string")

Parameters

  • df (pandas.DataFrame) [REQUIRED]:

    The pandas dataframe must have the following columns, and the column names must be in uppercase:

    MASTERCUSTOMERID, SCORE, VALUE_NAME, ROW_CREATED.

  • description (string) [OPTIONAL]:

    Default = “”

    An optional field to describe the model scores that are written. This description can be retrieved with the describe_output_data method.

Return type

dict

Returns

{
    "success": boolean,
    "num_rows_written": integer
}

Response structure

(dict)

  • –success (boolean)

    Indicates whether the output data was written or not.

  • –num_rows_written (integer)

    Indicates the number of rows written to the CDP ML Studio table linked with the project.

Exceptions/errors

  • ValueError: Occurs when the dataframe does not have the columns:

    MASTERCUSTOMERID, SCORE, VALUE_NAME, ROW_CREATED
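Since a missing column raises a ValueError, a local check before calling `write_output_data` can surface the problem earlier. The following is a minimal sketch; `missing_output_columns` is a hypothetical helper, and it treats column names as case-sensitive because the documentation requires uppercase names.

```python
# Column names copied from the parameter description above.
REQUIRED_COLUMNS = ("MASTERCUSTOMERID", "SCORE", "VALUE_NAME", "ROW_CREATED")


def missing_output_columns(columns):
    """Return the required column names that are absent (case-sensitive)."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# A lowercase "score" does not satisfy the uppercase requirement.
missing = missing_output_columns(["MASTERCUSTOMERID", "score", "VALUE_NAME"])
print(missing)
```

The helper accepts any iterable of names, so with a pandas dataframe it could be called as `missing_output_columns(df.columns)`.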

Model

list_models(**kwargs)

Lists all saved models that were created as part of the linked project. Provides an option to list all models that were created within the ML Studio instance, not limited to the linked project.

Request syntax

response = client.list_models(project_specific=boolean)

Parameters

  • project_specific (boolean) [OPTIONAL]:

    Default = True

    Set to True to list only models created in the linked project. If set to False, this lists all models created in the ML Studio instance, not limited to the linked project.

Return type

List[string]

Response structure

[string, string, ........]

Exceptions/errors

No specific exception occurs for this method.

load_model(**kwargs)

Retrieves a saved model from the ML Studio instance.

Request syntax

response = client.load_model(model_name="string")

Parameters

  • model_name (string) [REQUIRED]:

    Name of the model to be retrieved. This can be any model returned by the list_models method.

Return type

Model Object

Returns

A trained ML object that was saved using save_model. Returns None if there is no model with that name.

Response structure

N/A

Exceptions/errors

No specific exception occurs for this method.
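Because `load_model` returns None for an unknown name instead of raising, a common pattern is to load a model if it exists and train it otherwise. The sketch below illustrates that pattern under stated assumptions: `StubModelStore` is a hypothetical in-memory stand-in for the client, and `load_or_train` and `train` are illustrative names, not part of ML Client.

```python
class StubModelStore:
    """Hypothetical stand-in for the client's model methods (illustration only)."""

    def __init__(self):
        self._models = {}

    def load_model(self, model_name):
        # Returns None when the name is unknown, as documented for load_model.
        return self._models.get(model_name)

    def save_model(self, model, model_name, overwrite=False):
        if model_name in self._models and not overwrite:
            raise ValueError(f"model {model_name!r} already exists")
        self._models[model_name] = model


def load_or_train(client, model_name, train_fn):
    """Return the saved model, training and saving it first if it is missing."""
    model = client.load_model(model_name=model_name)
    if model is None:
        model = train_fn()
        client.save_model(model=model, model_name=model_name)
    return model


train_calls = []


def train():
    """Placeholder for real training; records that it ran."""
    train_calls.append(1)
    return {"weights": [0.1, 0.2]}


store = StubModelStore()
first = load_or_train(store, "demo_model", train)
second = load_or_train(store, "demo_model", train)
print(len(train_calls))
```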

save_model(**kwargs)

Saves a trained model. Allows a model to be saved so that the user can return to the notebook at a later time without rerunning the training.

Request syntax

response = client.save_model(model="object", model_name="string", overwrite="boolean")

Parameters

  • model (object) [REQUIRED]:

    A trained ML model. This can be any model that the Python pickle library can serialize, because the method uses pickle.dumps to save models.

  • model_name (string) [REQUIRED]:

    Name of the model.

  • overwrite (boolean) [OPTIONAL]:

    Default = False

    Set to True to overwrite an existing model with the same model name.

Return type

Returns nothing

Exceptions/errors

  • ValueError: Occurs if overwrite is set to False, and a model with the same name exists.
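Because models are saved with pickle.dumps, an object that cannot be pickled cannot be saved. The sketch below shows one way to test this locally before calling `save_model`; `pickled_size` is a hypothetical helper, and the set of exceptions it catches is an assumption about the most common pickling failures, not an exhaustive list.

```python
import pickle
import threading


def pickled_size(model):
    """Return the size in bytes of pickle.dumps(model), or None if pickling fails."""
    try:
        return len(pickle.dumps(model))
    except (pickle.PicklingError, AttributeError, TypeError):
        return None


# A plain dict pickles fine; a thread lock is a typical unpicklable object.
picklable = pickled_size({"coef": [0.5, -1.2], "intercept": 0.1})
unpicklable = pickled_size(threading.Lock())
print(picklable is not None, unpicklable)
```

Returning the size also gives a rough idea of how large the stored model will be.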

Schedule

add_schedule(**kwargs)

Schedules a notebook to run on an interval.

Request syntax

response = client.add_schedule(frequency="string", time="string", notebook_name="string", schedule_name="string")

Parameters

  • frequency (string) [REQUIRED]:

    The frequency at which the notebook is executed. Supported frequencies are daily, weekly, and monthly.

    Setting this to:

    • daily schedules it to run every day.
    • weekly schedules it to run on Monday of every week.
    • monthly schedules it to run on the 1st day of every month.
  • time (string) [REQUIRED]:

    Time of the day in UTC, in HH:MM format, when the notebook is executed. Use 24-hour format. For example, 18:00 indicates 6 PM UTC.

  • notebook_name (string) [REQUIRED]:

    The name of the notebook to be run on a schedule. This name should end with .ipynb.

  • schedule_name (string) [REQUIRED]:

    A name for the schedule. This can be used later to delete the schedule. This name must be unique.

Return type

Returns nothing

Exceptions/errors

  • ScheduleNotebookException: Occurs if:
    • Notebook is already scheduled. Only one schedule can be added for a notebook.
    • Schedule name already exists.
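The parameter rules above can be checked locally before calling `add_schedule`. The following is a minimal sketch: `schedule_argument_problems` is a hypothetical helper, and it assumes lowercase frequency values and a two-digit hour. Whether the service accepts other spellings, such as `6:00` or `Daily`, is not stated in this document.

```python
import re

FREQUENCIES = ("daily", "weekly", "monthly")
# Two-digit hour from 00 to 23 and two-digit minute from 00 to 59.
TIME_PATTERN = re.compile(r"([01]\d|2[0-3]):[0-5]\d")


def schedule_argument_problems(frequency, time, notebook_name, schedule_name):
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    if frequency not in FREQUENCIES:
        problems.append("frequency must be daily, weekly, or monthly")
    if not TIME_PATTERN.fullmatch(time):
        problems.append("time must be HH:MM in 24-hour format")
    if not notebook_name.endswith(".ipynb"):
        problems.append("notebook_name must end with .ipynb")
    if not schedule_name:
        problems.append("schedule_name must not be empty")
    return problems


good = schedule_argument_problems("weekly", "18:00", "scores.ipynb", "weekly-scores")
bad = schedule_argument_problems("hourly", "25:00", "scores.txt", "weekly-scores")
print(good)
print(bad)
```

This check cannot detect a duplicate schedule name or an already scheduled notebook; those conditions are only known to the service.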

delete_schedule(**kwargs)

Deletes a schedule added in the ML Studio instance.

Note

This method removes any existing schedule that might be populating custom model scores used within CDP. Use this method only if you no longer need the schedule to refresh model scores in CDP.

Request syntax

response = client.delete_schedule(schedule_name="string", notebook_full_path="string")

Parameters

  • schedule_name (string) [OPTIONAL]:

    Default = None

    Name of the schedule to be deleted.

  • notebook_full_path (string) [OPTIONAL]:

    Default = None

    The name of the notebook whose schedule is to be deleted. For example, my-notebook.ipynb.

Return type

Returns nothing

Exceptions/errors

  • ValueError: Occurs if schedule_name and notebook_full_path are None.
  • ScheduleDeleteException: Occurs if:
    • schedule_name does not exist.
    • The notebook full path is invalid.

describe_schedule(**kwargs)

Describes a schedule added in the ML Studio instance; the schedule does not have to belong to the linked project. The description includes the cron job schedule, the command that runs the scheduled notebook, and the schedule name.

Request syntax

response = client.describe_schedule(schedule_name="string")

Parameters

  • schedule_name (string) [REQUIRED]:

    Name of the schedule to be described.

Return type

List[string]

Response structure

[string, string, ........]

Exceptions/errors

  • ScheduleNotFoundException: Occurs if the schedule you want to describe does not exist.

list_schedules(**kwargs)

Lists the names of schedules. Includes their corresponding notebooks.

Request syntax

response = client.list_schedule_names(project_specific="boolean")

Parameters

  • project_specific (boolean) [OPTIONAL]:

    Default = True

    Set to True to list schedules added in the linked project. If set to False, this lists all schedules added in the ML Studio instance and not limited to the linked project.

Return type

List[string]

Response structure

[string, string, ........]

Exceptions/errors

No specific exception occurs for this method.

Surface to CDP

surface_output_data()

Executes a backend workflow to surface the custom models data to CDP.

Request syntax

response = client.surface_output_data()

Parameters

No parameters need to be passed to this method.

Return type

dict

Returns

{
    "run_id": "string"
}

Response structure

(dict)

  • –run_id (string)

    Indicates the run ID of the backend workflow that was triggered by the method. It can be used with the check_surfacing_status method to track the status.

Exceptions/errors

  • MlApiTriggerWorkflowException: Occurs if the backend workflow can’t be triggered.

check_surfacing_status(**kwargs)

Queries the status of the backend workflow using a run ID.

Request syntax

response = client.check_surfacing_status(run_id="string")

Parameters

  • run_id (string) [REQUIRED]:

    The run_id of the workflow. Typically, the value is retrieved when the workflow is triggered using the surface_output_data method.

Return type

dict

Returns

{
    "run_id": "string",
    "start_date": "string",
    "state": "string"
}

Response structure

(dict)

  • –run_id (string)

    Indicates the run_id of the workflow whose status is checked.

  • –start_date (string)

    Indicates the string representation of timestamp when the workflow was triggered (Format = yyyy-MM-dd'T'HH:mm:ss.SSSSSSX).

  • –state (string)

    Indicates the status of execution. Options are running, success, and failed.

Exceptions/errors

  • MlApiCheckWorkflowStatusException: Occurs when there is an error fetching the status of the workflow.
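The two surfacing methods are typically used together: trigger the workflow, then poll its status until it leaves the running state. The sketch below shows one way to do that. `wait_for_surfacing` and `StubSurfacingClient` are hypothetical names, the poll interval and poll limit are arbitrary defaults, and the stub returns invented values so that the example runs without the real service.

```python
import time


def wait_for_surfacing(client, poll_seconds=30.0, max_polls=20, sleep=time.sleep):
    """Trigger surfacing, then poll until the state is success or failed."""
    run_id = client.surface_output_data()["run_id"]
    for _ in range(max_polls):
        status = client.check_surfacing_status(run_id=run_id)
        if status["state"] in ("success", "failed"):
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"run {run_id} still running after {max_polls} polls")


class StubSurfacingClient:
    """Hypothetical stand-in that replays a fixed sequence of states."""

    def __init__(self, states):
        self._states = iter(states)

    def surface_output_data(self):
        return {"run_id": "run-123"}

    def check_surfacing_status(self, run_id):
        return {
            "run_id": run_id,
            "start_date": "2024-01-01T00:00:00.000000Z",  # invented sample value
            "state": next(self._states),
        }


stub = StubSurfacingClient(["running", "running", "success"])
# The sleep function is replaced so the demonstration does not actually wait.
final = wait_for_surfacing(stub, poll_seconds=0, sleep=lambda seconds: None)
print(final["state"])
```

A caller would still need to handle the failed state and the two documented exceptions according to its own needs.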

update_output_name(**kwargs)

Updates the column names where the custom model scores are written in the CDP modules, such as Actions, Metrics, and 360 Profile.

Request syntax

response = client.update_output_name(display_name="string", description="string")

Parameters

  • display_name (string) [REQUIRED]:

    The name for the column where model scores are written in the CDP modules, such as Actions, Metrics, and 360 Profile.

  • description (string) [OPTIONAL]:

    Default = None

    An optional field to describe the column where the model scores are written. This is visible in the info tool tip in the CDP user interface.

Return type

Returns nothing

Exceptions/errors

  • MlApiCdpUpdateException: Occurs when there is an error updating column names in any of the CDP modules, such as Actions, Metrics, and 360 Profile.