Announcing Optuna 3.1

c-bata · Published in Optuna · Jan 18, 2023 · 6 min read

We are pleased to announce the release of Optuna 3.1! Along with 3.1, we are releasing new sampling algorithms, new storage backends, and lots of performance improvements. This blog will walk you through those updates and explain how the project evolved from version 3.0. Let’s take a closer look at these new and improved features.

Summary:

  1. New CMA-ES variant algorithm for mixed-integer search spaces
  2. New file storage backend that is compatible with NFS (Network File System), allowing for scaling to multiple machines
  3. Brand-new Redis storage backend that resolves various concurrency issues
  4. Dask.distributed Integration for easy scaling to multiple machines
  5. New sampling algorithm for exhaustive search
  6. Improved performance of TPESampler’s constant_liar option
  7. SciPy has become an optional dependency of Optuna, reducing Optuna’s import time by 50%
  8. The new UI for Optuna Dashboard

The Python versions supported by this release are 3.7–3.11.

✨ What’s New

CMA-ES with Margin

Optuna provides CMA-ES via CmaEsSampler, which is especially powerful in high-dimensional continuous spaces. However, with Optuna’s default handling of discrete search spaces (e.g., FloatDistribution with step and IntDistribution), the sampler can get stuck in local solutions when the variance becomes small during optimization. CMA-ES with margin tackles this problem by introducing a lower bound on the marginal distribution of each discrete variable.

We benchmarked CMA-ES with margin on HPO-bench problems and confirmed that it significantly outperforms the original CMA-ES when the number of trials is large (400–600+). This feature can now be enabled simply by passing the with_margin=True argument. If you use CMA-ES for problems that include discrete search spaces, please try it out!

See #4016 for details.

Distributed Optimization via NFS (Network File System)

JournalStorage is a new operation-based logging storage. Unlike conventional storages, which store the “values”, JournalStorage stores the “operation logs”. This storage allows developers to implement custom storage backends with simpler APIs.

We have introduced JournalFileStorage, a new storage backend for file systems. It can be used on NFS, allowing Optuna programs to scale to multiple machines. This can be especially useful for users who want to use Optuna in environments where it is difficult to set up database servers such as MySQL, PostgreSQL or Redis, which is otherwise often required for distributed optimization (e.g. #815, #1330, #1457 and #2216).

import optuna
from optuna.storages import JournalStorage, JournalFileStorage

def objective(trial):
    x = trial.suggest_float("x", -100, 100)
    y = trial.suggest_float("y", -100, 100)
    return x**2 + y

storage = JournalStorage(JournalFileStorage("./journal.log"))
study = optuna.create_study(storage=storage)
study.optimize(objective)

A Brand-New Redis Storage

Redis is a widely used open-source in-memory data store, and Optuna has provided RedisStorage for tight integration with it. RedisStorage was introduced in v1.4.0 but remained an experimental feature due to insufficient performance, a lack of applicability validation, and several bugs in parallel execution.

We have removed the existing RedisStorage and introduced a brand-new Redis storage backend, JournalRedisStorage, in this release. JournalRedisStorage is one of the JournalStorage backends introduced in this release; it is simpler than RedisStorage, easy to use, and robust enough to work well in distributed environments.

The following is a brief example; as long as you have a Redis server, you can use it in exactly the same way as any other storage. See #4086 and #4102 for more information.

import optuna
from optuna.storages import JournalStorage, JournalRedisStorage

def objective(trial):
    x = trial.suggest_float("x", -100, 100)
    y = trial.suggest_float("y", -100, 100)
    return x**2 + y

storage = JournalStorage(JournalRedisStorage("redis://localhost:6379"))
study = optuna.create_study(storage=storage)
study.optimize(objective)

Dask.distributed Integration & Enhanced Concurrency Support

Enhanced Multi-threading Support
The concurrency support in Optuna is greatly improved for developers who are familiar with the concurrent.futures module. In previous versions, you could only use the n_jobs=<int> option to call study.optimize in multiple threads (e.g. study.optimize(objective, …, n_jobs=5) ). From v3.1, you can also use ThreadPoolExecutor as follows:

import optuna
from concurrent.futures import ThreadPoolExecutor

def objective(trial):
    x = trial.suggest_float("x", -100, 100)
    y = trial.suggest_float("y", -100, 100)
    return x**2 + y

study = optuna.create_study()
with ThreadPoolExecutor(max_workers=5) as pool:
    for i in range(5):
        pool.submit(study.optimize, objective, n_trials=10)
print(f"Best params: {study.best_params}")

Dask.distributed Integration
DaskStorage, a new storage backend based on Dask.distributed, is now supported. It allows you to leverage distributed capabilities with an API similar to concurrent.futures. DaskStorage can be used together with InMemoryStorage, so you don’t need to set up a database server. Here’s a code example showing how to use DaskStorage:

import optuna
from optuna.storages import InMemoryStorage
from optuna.integration import DaskStorage
from distributed import Client, wait

def objective(trial):
    ...

with Client("192.168.1.8:8686") as client:
    study = optuna.create_study(storage=DaskStorage(InMemoryStorage()))
    futures = [
        client.submit(study.optimize, objective, n_trials=10, pure=False)
        for i in range(10)
    ]
    wait(futures)
    print(f"Best params: {study.best_params}")

Setting up a Dask cluster is easy: install dask and distributed, then run the dask scheduler and dask worker commands, as detailed in the Quick Start Guide in the Dask.distributed documentation.

$ pip install dask distributed

$ dask scheduler
INFO - Scheduler at: tcp://192.168.1.8:8686
INFO - Dashboard at: :8687


$ dask worker tcp://192.168.1.8:8686
$ dask worker tcp://192.168.1.8:8686
$ dask worker tcp://192.168.1.8:8686

See the documentation for more information.

BruteForceSampler

BruteForceSampler, a new sampler for brute-force search, tries all combinations of parameters. In contrast to GridSampler, this sampler does not require any additional arguments and can handle conditional search spaces. It constructs the search space in the define-by-run style while processing the objective function. BruteForceSampler tries combinations in random order and also works with parallel optimization. To use this sampler, the search space must be discrete, i.e., suggest_float with step=None is not allowed.

import optuna

def objective(trial):
    c = trial.suggest_categorical("c", ["float", "int"])
    if c == "float":
        return trial.suggest_float("x", 1, 3, step=0.5)
    elif c == "int":
        a = trial.suggest_int("a", 1, 3)
        b = trial.suggest_int("b", a, 3)
        return a + b

study = optuna.create_study(sampler=optuna.samplers.BruteForceSampler())
study.optimize(objective)

See the documentation for more information.

🚀 Other Improvements

Bug Fix for TPE’s constant_liar Option

The constant_liar option in TPESampler promotes exploration in distributed optimization by adding all running trials to the “above” split of the TPE algorithm. However, there was a bug in the TPE logic that counts the running trials in the population to be classified into the below/above splits. Under certain conditions, some running trials, or some finished trials that returned early, were unconditionally classified into the “below” split, which degraded performance in early trials.

We have fixed this bug and have confirmed that it improves the constant_liar performance on HPOBench. For more information and performance benchmarks, please see #4073.

Make SciPy Dependency Optional

Half of the time spent on import optuna was consumed by SciPy-related modules. SciPy also takes up 110MB of storage space, which is problematic in resource-limited environments such as serverless computing.

We decided to implement the necessary scientific functions ourselves to make the SciPy dependency optional. Thanks to contributors’ efforts on performance optimization, our implementation is as fast as the SciPy-based code even though it is written in pure Python. See #4105 for more information.

Note that QMCSampler still depends on SciPy. If you use QMCSampler, please explicitly specify SciPy as your dependency.

The New UI for Optuna Dashboard


We are developing a new UI for Optuna Dashboard that is available as an opt-in feature from the beta release — simply launch the dashboard as usual and click the link to the new UI. Please try it out and share your thoughts with us.

$ pip install "optuna-dashboard>=0.9.0b2"

Feedback Survey: The New UI for Optuna Dashboard

What’s Ahead

While we initially planned three main development targets in the Optuna v3.1 roadmap, we ended up delivering more than anticipated. Several of the storage and performance improvements described above were not on the roadmap initially but were developed by our growing base of contributors.

Main development targets for the upcoming Optuna v3.2 are still in the planning stages. We continually review which features to include and listen to community input on new areas where Optuna can help. You are welcome to submit feature requests as GitHub issues.

Contributors

As with any other release, this one would not have been possible without the feedback, code, and comments from many contributors.

@Abelarm, @Alnusjaponica, @ConnorBaker, @Hakuyume, @HideakiImamura, @Jasha10, @amylase, @belldandyxtq, @c-bata, @contramundum53, @cross32768, @erentknn, @eukaryo, @g-votte, @gasin, @gen740, @gonzaload, @halucinor, @himkt, @hrntsm, @hvy, @jmsykes83, @jpbianchi, @jrbourbeau, @keisuke-umezawa, @knshnb, @mist714, @ncclementi, @not522, @nzw0301, @rene-rex, @reyoung, @semiexp, @shu65, @sile, @toshihikoyanase, @wattlebirdaz, @xadrianzetx, @zaburo-ch

Thanks to those who have followed the projects from the very early days and those who have joined along the way.

Next Step

Check out the release notes for more information. To get the latest news from Optuna, follow us on Twitter. For feedback, please file an issue or create a pull request on GitHub.

c-bata

Creator of go-prompt and kube-prompt. Optuna core-dev. GitHub: c-bata