@misc{zhang2024autocoderover,
title={AutoCodeRover: Autonomous Program Improvement},
author={Yuntong Zhang and Haifeng Ruan and Zhiyu Fan and Abhik Roychoudhury},
year={2024},
eprint={2404.05427},
archivePrefix={arXiv},
primaryClass={cs.SE}
}
A SequenceInputStream represents the logical concatenation of other input streams. It starts out with an ordered collection of input streams and reads from the first one until end of file is reached, whereupon it reads from the second one, and so on, until end of file is reached on the last of the contained input streams.
pgloader will keep a separate file of rejected data, but continue trying to copy good data in your database.
pgloader also implements data reformatting, a typical example of that being the transformation of MySQL datestamps 0000-00-00 and 0000-00-00 00:00:00 to PostgreSQL NULL value
A very common workflow is to index some data based on its embeddings and then given a new query embedding retrieve the most similar examples with k-Nearest Neighbor search. For example, you can imagine embedding a large collection of papers by their abstracts and then given a new paper of interest retrieve the most similar papers to it.
TLDR in my experience it ~always works better to use an SVM instead of kNN, if you can afford the slight computational hit
["slug" being an entity attribute]
Spring Data offers an existsBy query method, which we can define in the PostRepository, as follows:
1
2
3
4
5
6
@Repository
public interface PostRepository
extends JpaRepository<Post, Long> {
boolean existsBySlug(String slug);
}
[another] option to emulate existence is using a CASE WHEN EXISTS native SQL query:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
@Repository
public interface PostRepository
extends JpaRepository<Post, Long> {
@Query(value = """
SELECT
CASE WHEN EXISTS (
SELECT 1
FROM post
WHERE slug = :slug
)
THEN 'true'
ELSE 'false'
END
""",
nativeQuery = true
)
boolean existsBySlugWithCase(@Param("slug") String slug);
}
@Repository
public interface PostRepository extends BaseJpaRepository<Post, Long> {
@Query("""
select p
from Post p
where date(p.createdOn) >= :sinceDate
"""
)
@QueryHints(
@QueryHint(name = AvailableHints.HINT_FETCH_SIZE, value = "25")
)
Stream<Post> streamByCreatedOnSince(@Param("sinceDate") LocalDate sinceDate);
}
The FETCH_SIZE JPA query hint is necessary for PostgreSQL and MySQL to instruct the JDBC Driver to prefetch at most 25 records. Otherwise, the PostgreSQL and MySQL JDBC Drivers would prefetch all the query results prior to traversing the underlying ResultSet.
When Hibernate loads an object into a Session it creates a state snapshot of the current database state of the object, so that it can perform dirty checking against the snapshot.
As a read only object will never be modified, this snapshot is not needed and memory can be saved.
E. Pinheiro, W. Weber, und L. Barroso. Proceedings of the 5th USENIX Conference on File and Storage Technologies, Seite 2--2. Berkeley, CA, USA, USENIX Association, (2007)
T. Reps, S. Horwitz, und M. Sagiv. Proceedings of the 22Nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Seite 49--61. New York, NY, USA, ACM, (1995)