
rg.Query

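To collect records based on search criteria, you can use the Query and Filter classes. The Query class defines the search criteria, while the Filter class narrows the search results. A Filter is passed to a Query object, so you can combine several filter conditions to build complex search queries. A Query object can in turn be passed to Dataset.records to fetch the records that match the search criteria.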

Usage Examples

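Searching for records with terms

To search for records with terms, you can use the Dataset.records attribute with a query string. The search terms are matched against the text fields of the records, and records containing them are returned.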

for record in dataset.records(query="paris"):
    print(record)

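Filter by conditions

Argilla allows you to filter records based on conditions. You can use the Filter class to define the conditions and pass them to the Dataset.records attribute to fetch the matching records. Supported operators include "==", ">=", "<=", and "in". Conditions can be combined with dot notation to filter records based on metadata, suggestions, or responses.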

# create a range from 10 to 20
range_filter = rg.Filter(
    [
        ("metadata.count", ">=", 10),
        ("metadata.count", "<=", 20)
    ]
)

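# query records that contain "paris" and whose metadata count is between 10 and 20 (inclusive)
# note: the keyword argument is `filter`, matching the Query.__init__ signature
query = rg.Query(filter=range_filter, query="paris")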

# iterate over the results
for record in dataset.records(query=query):
    print(record)
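
The filtering itself happens on the Argilla server, so the following is only a local illustration of what the condition tuples express. It is a minimal sketch in plain Python, assuming records are nested dictionaries; the helpers `lookup` and `matches` are hypothetical and not part of Argilla, and the handling of missing fields is an assumption.

```python
from typing import Any, Dict, List, Tuple

Condition = Tuple[str, str, Any]


def lookup(record: Dict[str, Any], dotted: str) -> Any:
    """Resolve a dotted field name such as "metadata.count" in a nested dict."""
    value: Any = record
    for part in dotted.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def matches(record: Dict[str, Any], conditions: List[Condition]) -> bool:
    """Return True only if every condition holds (conditions are ANDed)."""
    for field, operator, expected in conditions:
        actual = lookup(record, field)
        if actual is None:
            return False  # assumption: a missing field never matches
        if operator == "==":
            ok = actual == expected
        elif operator == ">=":
            ok = actual >= expected
        elif operator == "<=":
            ok = actual <= expected
        elif operator == "in":
            ok = actual in expected
        else:
            raise ValueError(f"unsupported operator: {operator!r}")
        if not ok:
            return False
    return True


# Toy records standing in for dataset records (illustrative data only).
records = [
    {"id": "a", "metadata": {"count": 5}},
    {"id": "b", "metadata": {"count": 10}},
    {"id": "c", "metadata": {"count": 15}},
    {"id": "d", "metadata": {"count": 25}},
]

range_conditions = [("metadata.count", ">=", 10), ("metadata.count", "<=", 20)]
selected = [r["id"] for r in records if matches(r, range_conditions)]
print(selected)  # ['b', 'c']
```

Both bounds are inclusive, which is why the record with a count of exactly 10 is selected.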

Query

This class is used to map user queries to the internal query models.

Source code in src/argilla/records/_search.py
class Query:
    """This class is used to map user queries to the internal query models"""

    def __init__(
        self,
        *,
        query: Union[str, None] = None,
        similar: Union[Similar, None] = None,
        filter: Union[Filter, Conditions, None] = None,
    ):
        """Create a query object for use in Argilla search requests.add()

        Parameters:
            query (Union[str, None], optional): The query string that will be used to search.
            similar (Union[Similar, None], optional): The similar object that will be used to search for similar records
            filter (Union[Filter, None], optional): The filter object that will be used to filter the search results.
        """

        if isinstance(filter, tuple):
            filter = [filter]

        if isinstance(filter, list):
            filter = Filter(conditions=filter)

        self.query = query
        self.filter = filter
        self.similar = similar

    def has_search(self) -> bool:
        return bool(self.query or self.has_similar() or self.filter)

    def has_similar(self) -> bool:
        return bool(self.similar)

    def api_model(self) -> SearchQueryModel:
        model = SearchQueryModel()

        if self.query or self.similar:
            query = QueryModel()

            if self.query is not None:
                query.text = TextQueryModel(q=self.query)

            if self.similar is not None:
                query.vector = self.similar.api_model()

            model.query = query

        if self.filter is not None:
            model.filters = self.filter.api_model()

        return model
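
The constructor above accepts the `filter` argument in three shapes: a single condition tuple, a list of condition tuples, or a ready-made Filter. The sketch below mirrors that normalization with stand-in classes so it runs without an Argilla server; `QuerySketch` and `FilterSketch` are hypothetical names, not the library's classes, and the `similar` argument is left out for brevity.

```python
from typing import Any, List, Optional, Tuple, Union

Condition = Tuple[str, str, Any]


class FilterSketch:
    """Stand-in for Filter: stores a list of (field, operator, value) tuples."""

    def __init__(self, conditions: Union[List[Condition], Condition]):
        if isinstance(conditions, tuple):
            conditions = [conditions]
        self.conditions = list(conditions)


class QuerySketch:
    """Stand-in for Query: normalizes `filter` the way the listing above does."""

    def __init__(
        self,
        *,
        query: Optional[str] = None,
        filter: Union[FilterSketch, List[Condition], Condition, None] = None,
    ):
        if isinstance(filter, tuple):  # a single condition becomes a one-item list
            filter = [filter]
        if isinstance(filter, list):  # a list of conditions is wrapped in a filter
            filter = FilterSketch(filter)
        self.query = query
        self.filter = filter

    def has_search(self) -> bool:
        return bool(self.query or self.filter)


condition = ("label", "==", "positive")

from_tuple = QuerySketch(filter=condition)
from_list = QuerySketch(filter=[condition])
from_filter = QuerySketch(filter=FilterSketch([condition]))

# All three spellings end up with the same list of conditions.
print(from_tuple.filter.conditions == from_list.filter.conditions == from_filter.filter.conditions)
```

Because the arguments are keyword-only, a call such as `QuerySketch("paris")` raises a `TypeError`, as would a misspelled keyword.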

__init__(*, query=None, similar=None, filter=None)

Create a query object for use in Argilla search requests.

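Parameters

Name | Type | Description | Default
query | Union[str, None] | The query string that will be used to search. | None
similar | Union[Similar, None] | The Similar object that will be used to search for similar records. | None
filter | Union[Filter, None] | The Filter object that will be used to filter the search results. A single condition tuple or a list of condition tuples is also accepted and wrapped in a Filter. | None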

Source code in src/argilla/records/_search.py
def __init__(
    self,
    *,
    query: Union[str, None] = None,
    similar: Union[Similar, None] = None,
    filter: Union[Filter, Conditions, None] = None,
):
    """Create a query object for use in Argilla search requests.add()

    Parameters:
        query (Union[str, None], optional): The query string that will be used to search.
        similar (Union[Similar, None], optional): The similar object that will be used to search for similar records
        filter (Union[Filter, None], optional): The filter object that will be used to filter the search results.
    """

    if isinstance(filter, tuple):
        filter = [filter]

    if isinstance(filter, list):
        filter = Filter(conditions=filter)

    self.query = query
    self.filter = filter
    self.similar = similar

Filter

This class is used to map user filters to the internal filter models.

Source code in src/argilla/records/_search.py
class Filter:
    """This class is used to map user filters to the internal filter models"""

    def __init__(self, conditions: Union[Conditions, None] = None):
        """ Create a filter object for use in Argilla search requests.

        Parameters:
            conditions (Union[List[Tuple[str, str, Any]], Tuple[str, str, Any], None], optional): \
                The conditions that will be used to filter the search results. \
                The conditions should be a list of tuples where each tuple contains \
                the field, operator, and value. For example `("label", "in", ["positive","happy"])`.\
        """

        if isinstance(conditions, tuple):
            conditions = [conditions]
        self.conditions = [Condition(condition) for condition in conditions]

    def api_model(self) -> AndFilterModel:
        return AndFilterModel.model_validate({"and": [condition.api_model() for condition in self.conditions]})
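
In the listing above, `conditions` is iterated right after the tuple check, and the resulting conditions are combined under a single "and" key. The helper below is a minimal sketch of one way to normalize the accepted input shapes before that step, including the `None` default. `normalize_conditions` is a hypothetical name and not part of Argilla, and the length check is an added assumption.

```python
from typing import Any, List, Optional, Tuple, Union

Condition = Tuple[str, str, Any]


def normalize_conditions(
    conditions: Union[List[Condition], Condition, None] = None,
) -> List[Condition]:
    """Return a list of (field, operator, value) tuples for any accepted input."""
    if conditions is None:
        return []  # assumption: no conditions means nothing to filter on
    if isinstance(conditions, tuple):
        conditions = [conditions]  # a lone tuple is treated as one condition
    normalized = list(conditions)
    for condition in normalized:
        if not isinstance(condition, tuple) or len(condition) != 3:
            raise ValueError(f"expected (field, operator, value), got {condition!r}")
    return normalized


single = normalize_conditions(("label", "in", ["positive", "happy"]))
several = normalize_conditions(
    [("metadata.count", ">=", 10), ("metadata.count", "<=", 20)]
)
print(len(normalize_conditions()), len(single), len(several))  # 0 1 2
```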

__init__(conditions=None)

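Create a filter object for use in Argilla search requests.

Parameters

Name | Type | Description | Default
conditions | Union[List[Tuple[str, str, Any]], Tuple[str, str, Any], None] | The conditions that will be used to filter the search results. The conditions should be a list of tuples where each tuple contains the field, operator, and value. For example ("label", "in", ["positive","happy"]). | None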

Source code in src/argilla/records/_search.py
def __init__(self, conditions: Union[Conditions, None] = None):
    """ Create a filter object for use in Argilla search requests.

    Parameters:
        conditions (Union[List[Tuple[str, str, Any]], Tuple[str, str, Any], None], optional): \
            The conditions that will be used to filter the search results. \
            The conditions should be a list of tuples where each tuple contains \
            the field, operator, and value. For example `("label", "in", ["positive","happy"])`.\
    """

    if isinstance(conditions, tuple):
        conditions = [conditions]
    self.conditions = [Condition(condition) for condition in conditions]

Similar

This class is used to map user similarity queries to the internal query models.

Source code in src/argilla/records/_search.py
class Similar:
    """This class is used to map user similar queries to the internal query models"""

    def __init__(self, name: str, value: Union[Iterable[float], "Record"], most_similar: bool = True):
        """
        Create a similar object for use in Argilla search requests.

        Parameters:
            name: The name of the vector field
            value: The vector value or the record to search for similar records
            most_similar: Whether to search for the most similar records or the least similar records
        """

        self.name = name
        self.value = value
        self.most_similar = most_similar if most_similar is not None else True

    def api_model(self) -> VectorQueryModel:
        from argilla.records import Record

        order = "most_similar" if self.most_similar else "least_similar"

        if isinstance(self.value, Record):
            return VectorQueryModel(name=self.name, record_id=self.value._server_id, order=order)

        return VectorQueryModel(name=self.name, value=self.value, order=order)
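
The `api_model` method above makes two decisions: the `most_similar` flag selects the order string, and the type of `value` decides whether a record id or a raw vector is sent. The sketch below reproduces that dispatch with plain dictionaries so it can run on its own; `RecordSketch` and `vector_query_payload` are hypothetical stand-ins, while the keys `name`, `record_id`, `value`, and `order` are taken from the listing.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Union


@dataclass
class RecordSketch:
    """Stand-in for a Record that already has a server-side id."""

    server_id: str


def vector_query_payload(
    name: str,
    value: Union[Iterable[float], RecordSketch],
    most_similar: bool = True,
) -> Dict[str, Any]:
    """Build a dict with the same fields the listing passes to VectorQueryModel."""
    order = "most_similar" if most_similar else "least_similar"
    if isinstance(value, RecordSketch):
        # Search by reference: the server looks up the vector of this record.
        return {"name": name, "record_id": value.server_id, "order": order}
    # Search by value: the vector itself is sent.
    return {"name": name, "value": list(value), "order": order}


by_vector = vector_query_payload("embedding", [0.1, 0.2, 0.3])
by_record = vector_query_payload("embedding", RecordSketch("abc-123"), most_similar=False)

print(by_vector["order"], by_record["order"])  # most_similar least_similar
```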

__init__(name, value, most_similar=True)

Create a similar object for use in Argilla search requests.

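Parameters

Name | Type | Description | Default
name | str | The name of the vector field. | required
value | Union[Iterable[float], Record] | The vector value or the record to search for similar records. | required
most_similar | bool | Whether to search for the most similar records or the least similar records. | True

Source code in src/argilla/records/_search.py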
def __init__(self, name: str, value: Union[Iterable[float], "Record"], most_similar: bool = True):
    """
    Create a similar object for use in Argilla search requests.

    Parameters:
        name: The name of the vector field
        value: The vector value or the record to search for similar records
        most_similar: Whether to search for the most similar records or the least similar records
    """

    self.name = name
    self.value = value
    self.most_similar = most_similar if most_similar is not None else True