跳转到主要内容

This Week in Databend #74

· 4 分钟读完

Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: https://app.databend.com .

What's New

Check out what we've done this week to make Databend even better for you.

Features & Improvements ✨

Meta

  • remove stream when a watch client is dropped (#9334)

Planner

  • support selectivity estimation for range predicates (#9398)

查询

  • support copy on error (#9312)
  • support databend-local (#9282)
  • external storage support location part prefix (#9381)

Storage

  • rangefilter support in (#9330)
  • try to improve object storage io read (#9335)
  • supprot table compression (#9370)

Metrics

  • add more metrics for fuse compact and block write (#9399)

Sqllogictest

  • add no-fail-fast support (#9391)

Code Refactoring 🎉

*

  • adopt rustls entirely, removing all deps to native-tls (#9358)

Format

  • remove format_xxx settings (#9360)
  • adjust interface of FileFormatOptionsExt (#9395)

Planner

  • remove SyncTypeChecker (#9352)

查询

  • split fuse source to read data and deserialize (#9353)
  • avoid io copy in read parquet data (#9365)
  • add uncompressed buffer for parquet reader (#9379)

Storage

  • add read/write settings (#9359)

Bug Fixes 🔧

Format

  • fix align_flush with header only (#9327)

Settings

  • use logical CPU number as default value of num_cpus (#9396)

处理器

  • the data type on both sides of the union does not match (#9361)

HTTP Handler

  • false alarm (warning log) about query not exists (#9380)

Sqllogictest

  • refactor sqllogictest http client and fix expression string like (#9363)

What's On In Databend

Stay connected with the latest news about Databend.

Introducing databend-local​

Inspired by clickhouse-local, databend-local allows you to perform fast processing on local files, without the need of launching a Databend cluster.

> export CONFIG_FILE=tests/local/config/databend-local.toml
> cargo run --bin=databend-local -- --sql="SELECT * FROM tbl1" --table=tbl1=/path/to/databend/docs/public/data/books.parquet

exec local query: SELECT * FROM tbl1
+------------------------------+---------------------+------+
| title | author | date |
+------------------------------+---------------------+------+
| Transaction Processing | Jim Gray | 1992 |
| Readings in Database Systems | Michael Stonebraker | 2004 |
| Transaction Processing | Jim Gray | 1992 |
| Readings in Database Systems | Michael Stonebraker | 2004 |
+------------------------------+---------------------+------+
4 rows in set. Query took 0.015 seconds.

Learn More

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Compressing Short Strings​

When processing the same queries with short strings involved, Databend usually reads more data than other databases, such as Snowflake.

SELECT SearchPhrase, MIN(URL), COUNT(*) AS c FROM hits WHERE URL LIKE '%google%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;

Such queries might be more efficient if short strings (URLs, etc) are compressed.

Issue 9001: performance: compressing for short strings

Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Contributors

Thanks a lot to the contributors for their excellent work this week.

andylokandyariesdevilBohuTANGdantengskydrmingdrmereastfisher
andylokandyariesdevilBohuTANGdantengskydrmingdrmereastfisher
everpcpcleiyskymergify[bot]PsiACERinChanNOWWWsoyeric128
everpcpcleiyskymergify[bot]PsiACERinChanNOWWWsoyeric128
sundy-liXuanwoxudong963youngsofunzhang2014zhyass
sundy-liXuanwoxudong963youngsofunzhang2014zhyass

Connect With Us

We'd love to hear from you. Feel free to run the code and see if Databend works for you. Submit an issue with your problem if you need help.

DatafuseLabs Community is open to everyone who loves data warehouses. Please join the community and share your thoughts.